Educational planners and policy makers need adequate
information about the societal context of education to make
appropriate decisions about the future role and function of
education. Some of this information may be provided through
the use of conceptually sound social and educational
variables operationally defined as time series indicators
coupled with an empirically sound basis for forecasting
future trends in such indicators. As evidence of the need
for developing such a social forecasting framework for
education, states including Florida have provided grants
for that purpose. This study was one aspect of such a
grant.

The problem in this study was (a) to select, using
Bronfenbrenner's ecology of education model, and operationally
define at least 10 variables that research has shown
to be related to the outcomes of education; (b) to use these
variables operationally defined as time series indicators in
the comparison of three purely extrapolative forecasting
methods; and (c) to derive implications for the use of an
ecological model such as Bronfenbrenner's, time series
indicators, and selected extrapolative methods for
educational planning.

The  study  was  conducted  in  the  following  phases: 

1. Using Bronfenbrenner's ecology of education model,
10 variables that research has shown to be related to the
outcomes of education were selected and, where possible,
were operationally defined as state and/or national time
series indicators. Data were collected for these indicators;
eight that met the criteria established in this study
were used in the comparison of extrapolative techniques.

2. Three purely extrapolative techniques derived from
the general linear model were compared according to
statistical criteria and practical considerations derived from
the literature in statistics, economics, time series
analysis, and forecasting methodology. The methods were
(a) linear regression, (b) curvilinear regression (quadratic
and cubic forms), and (c) log-linear regression (dependent
variable undergoes logarithmic transformation).
Each method was applied to each time series indicator. Time
in years was used as the independent variable; the annual
measure of the indicator was treated as the dependent
variable. Each data set was divided into thirds; two-thirds
of the data points were used to establish the prediction
equation. This equation was used to predict the remaining
third of the data points. Predicted values were compared
with actual values.

3. Implications for the use in educational planning
of an ecological model such as Bronfenbrenner's, time series
indicators, and selected extrapolative techniques were
discussed.

Results  of  the  method  comparison  were  (a)  no  method 
was  a  superior  predictor  for  all  indicators;  (b)  each 
method  was  a  superior  predictor  for  at  least  one  indicator; 
and  (c)  the  summary  statistics  for  the  original  regression 
were not consistently related to the accuracy of the
extrapolated values.

The  following  conclusions  appear  to  be  warranted  by 
the  results  of  this  study: 

1.  The  Bronfenbrenner  model  is  a  useful  framework 
for  considering  the  numerous  factors  impinging  upon  the 
learner. 

2.  Time  series  indicators  provide  a  means  to  compare 
trends  in  an  indicator  over  time  or  to  compare  different 
groups  in  relation  to  a  specific  indicator. 

3.  The  general  linear  model  is  appropriate  for  the 
analysis  and  extrapolation  of  the  selected  time  series 
indicators  used  in  this  study. 

4. Each method is appropriate for use with some
indicators but not with others. Measures of "best fit" such
as r² and the standard error of estimate are not reliable
criteria for the selection of an extrapolative method. A
combination of strategies such as graphic representation of
original and predicted data, analysis of residuals, and
knowledge of the social phenomena being studied may provide
guidance as to the most appropriate method for a particular
indicator.
CHAPTER I
INTRODUCTION 

Background  and  Significance  of  the  Study 

The  Social  Context  of  Education 

Educators have become increasingly cognizant of the
myriad forces in society impinging upon various facets of the
educational process. The influence of a number of these
forces upon educational purposes, outcomes, and resources
has been analyzed from several social science perspectives
(Boocock, 1976; Henry, 1961; Gordon, 1974). Keppel (in
Thomas & Larson, 1976) acknowledged one of the reasons for
this continuing interest in societal trends by educational
planners and policy-makers:

the  impetus  for  change  in  educational 
institutions,  from  the  preschool  through 
the  university,  is  more  likely  to  derive 
from  changes  in  the  wider  society  than  from 
forces  within  the  institutions.   (Foreword) 

Additionally,  Keppel  noted  that  "educational  policy  must  be 
formed  in  concert  with  other  aspects  of  public  policy  and 
program  development"   (Foreword) . 

Bronfenbrenner (1976) proposed an ecological structure
of the educational environment which must be taken into
account if "any progress in the scientific study of
educational systems and processes" (p. 5) is to be made.
Bronfenbrenner stated:

Whether  and  how  people  learn  is  a  function 
of  sets  of  forces,  or  systems,  at  two  levels: 

a.  The  first  comprises  the  relations  between 
characteristics of learners and their
surroundings in which they live out their
lives  (e.g.,  home,  school,  peer  group, 
work  place,  neighborhood,  community). 

b.  The  second  encompasses  the  relations 
and  interconnections  that  exist  between 
these  environments.   (p.  5) 

Building on Lewin's theory of topological territories
and employing a terminology adapted from Brim (1975),
Bronfenbrenner further elaborated that the construct environment can
be "conceived topologically as a nested arrangement of
structures, each contained within the next" (p. 5).

1)  A  micro-system  is  an  immediate  setting 
containing  the  learner  .... 

2) The meso-system comprises the
interrelationships among the major settings
containing the learner at a particular
point in his or her life ... a system
of micro-systems.

3) The exo-system is an extension of the
meso-system embracing the concrete social
structures, both formal and informal, that
impinge upon or encompass the immediate
settings containing the learner and,
thereby, influence and even determine or delimit
what goes on there. These structures
include the major institutions of society,
both deliberately structured and
spontaneously evolving, as they operate at the
local community level ....

4) Macro-systems are the overarching
institutions of the culture or subculture,
such as the economic, social, educational,
legal and political systems, of which
local micro-, meso-, and exo-systems are
the concrete manifestations. (pp. 5-6)

(See  Figure  1  for  a  representation  of 
these  ideas.) 
Figure 1. Bronfenbrenner's ecological structure of the
educational environment. (Based upon
Bronfenbrenner's [1976] description, pp. 5-6.)


The  Futures*  Perspective  in  Educational  Planning 

While Bronfenbrenner proposed his ecological structure
primarily as a framework for learning research efforts, that
is, for examining relationships among variables associated
with learning, others (Harman, 1976; Webster, 1976) have
discussed the societal context of education as a framework for
future-oriented educational planning. Indeed, this emphasis
on future awareness has evolved into a significant movement
within education referred to as educational futurism (Hencley
& Yates, 1974; Pulliam & Bowman, 1974), educational futures
(Marien & Ziegler, 1972), or alternative futures perspective
(Webster, 1976). The primary purpose of futures research or
future studies is "to help policy makers choose wisely--in
terms of their purposes and values--among alternative courses
of action that are open to leadership at a given time" (Shane,
1973, p. 1). According to Webster (1976), this requires that

we attend to alternatives--to alternative
assumptions, ends and means. It requires
us to examine alternative plausible futures
that might be rendered more or less possible
by our planning and action; to identify
unintended as well as intended consequences
for others of achieving the goals that seem
desirable to us; to analyze alternative
strategies and tactics for achieving any desired
future; and to anticipate the variety of
potential consequences of our strategies,
tactics, and short-run planning. Perhaps,
most fundamentally it asks of us that we
look hard at our basic premises about the
nature of man and the world and consider
implications and alternatives for the
future. (p. 2)


*"Futures"  refers  to  the  number  of  different  possible 
views  of  what  is  ahead  in  subsequent  time  periods  for  society 
and,  thus,  for  education. 


Webster  also  noted: 

the futures perspective implies that we not
just attend to alternatives in and for
education, but also consider the societal context
in more comprehensive fashion than is usual
in educational planning. (p. 2)

In order to assist decision makers in the selection of
alternatives which have positive future consequences for
society, educational planners at both national and state
levels must take into account those societal forces which
affect not only the outcomes of education, but also the
purposes of education, and the human and material resources
available to the educational process. To do this, however,
the planner must delineate the societal factors or variables
to be included in the planning process and develop a sound
rationale based on research and theory for such inclusion.
Then trends--past, present, and future--in these variables
may be examined in order to derive implications for
educational planning and policy.

Forecasting  Trends  in  Social  Variables 

Available to the educational planner in this undertaking
are a number of predictive and heuristic devices to explore
alternative futures which have been developed by government,
industry, non-profit organizations, and futures consulting
groups. These forecasting techniques can be categorized
into exploratory forecasting methods and normative forecasting
methods:

Exploratory forecasting methods start from
the present situation and its preceding
history, and attempt to project future
developments. Normative forecasts, on the
contrary, start with some desired or
postulated future situation, and work
backwards to derive feasible routes for the
transition from the present to the desired
future. (Martino, 1976, p. 4)

Exploratory forecasting methods, all of which are based
upon extrapolation of some kind, include (a) purely
extrapolative methods, (b) explanatory methods, and (c) auxiliary
methods. Since forecasting of social phenomena is still in
a highly intuitive developmental phase, there is a growing
interest in examining those exploratory methods considered
to be purely extrapolative, which are based upon time series
data representing social and educational variables. These
time series data, often called time series indicators, are
defined measurements made at specified intervals over a
period of time. By extrapolating identified patterns in the
time series data into the future, planners may compare present,
past, and future states of that indicator. Thus, a
projection of future societal trends can provide the impetus to
examine present policy and to analyze the consequences of
contemplated changes. This approach need not be only
"preventive" forecasting, in the sense used by Ziegler (1972) of
preventing undesirable forecasts. It may also be extended
to examine all consequences of action or intervention,
intended or not. Purely extrapolative methods, when combined
with auxiliary methods such as trend-impact analysis,
cross-impact matrices, or scenarios, can provide a vehicle for
exploring the relationships among identified future patterns in
society.


While the use of purely extrapolative methods with time
series data is fairly well defined in technological and
economic areas, their application to social forecasting has
not been the focus of significant definitive study. Indeed,
Harrison (1976) emphasized the need for such research,
specifically the consideration of "each method in terms of some
aspect of the social process it would likely be applied to"
(p. 13). For, as Harrison explained, while some problems
in regression and time series analysis which remain unresolved
are currently the concern of statisticians and mathematicians,
"it appears that resolution might best lie in terms of
investigation in concrete application cases" (p. 14).

In  social  forecasting  there  is  a  great  need 
in  almost  all  the  known  extrapolative  methods 
for  an  explicit  statement  of  the  algorithmic, 
theoretical,  and  empirical  weaknesses  or 
sensitivities  of  such  procedures.   Such  a 
discussion, as noted, would be more
meaningful if carried on in the context of an
analysis  of  some  specific  aspect  or  aspects 
of  social  process.   (Harrison,  1976,  p.  17) 

Only  through  empirical  study  of  the  performance  of  various 
extrapolative  methods  applied  to  particular  social  phenomena 
will a basis for selection of appropriate and accurate
techniques be formulated.

The  Need  for  Research 

Since there are no widely-accepted planning models
incorporating quantitative data on social variables, the
educational planner who wants to utilize such information is
confronted with a number of questions related to (a) the
identification of social variables to be included, (b) the
operational definition of social variables in terms of time
series indicators, (c) the selection of a purely extrapolative
technique which will yield the most accurate forecast for a
specific indicator, and (d) the utilization of these forecasts
in the planning process. Answers require futures research
which is derived from a conceptually sound framework and is
pursued with methodological vigor. As evidence of the
importance of such investigation to the educational planner, the
State of Florida through the Office of Strategy Planning in
the Department of Education funded in 1976 a social
forecasting project (STAR Project No. R5-175) at the University
of Florida for the second year. The study described herein
was part of that effort to forecast social trends affecting
education in Florida.

To  summarize:   Educational  planners  and  policy  makers 
need  adequate  information  to  make  appropriate  decisions  about 
the  role  and  function  of  education  in  creating  improved 
quality  of  life  for  citizens  of  the  future.   The  State  of 
Florida,  in  funding  STAR  Project  No.  R5-175  of  which  this 
study is a part, acknowledged that need. Part of this
information may be provided through the use of conceptually
sound  social  and  educational  variables  operationally  defined 
as  time  series  indicators  coupled  with  an  empirically  sound 
basis  for  forecasting  future  states  of  such  indicators. 


The  Problem 
The problem in this study was (a) to select, using
Bronfenbrenner's ecology of education model, and operationally
define at least 10 variables that research has shown to be
related to the outcomes of education; (b) to use these
variables operationally defined as time series indicators in
the comparison of three purely extrapolative forecasting
methods; and (c) to derive implications for the use of an
ecological model such as Bronfenbrenner's, time series indicators,
and selected extrapolative methods for educational planning.

Delimitations  and  Limitations 

The  Bronfenbrenner  ecology  of  education  model  was  used 
primarily  as  a  framework  for  the  selection  of  social  and 
educational  variables  and  was  not  evaluated  itself  in  this 
study.   Ten  variables  (e.g.,  socio-economic  status  of  family, 
peer  group  characteristics)  were  selected  to  be  operationally 
defined,  where  possible,  in  terms  of  national  and/or  state 
level  time  series  indicators  (e.g.,  median  family  income, 
juvenile  crime  rates).   Of  these  identified  indicators,  eight 
which  met  the  following  criteria  were  used  in  the  comparison 
of  extrapolative  techniques:   (a)  the  indicator  was  readily 
available,  (b)  the  data  were  available  for  a  10  year  or  greater 
time  span,  and  (c)  the  indicator  was  a  reasonably  reliable  and 
valid  measure  of  one  aspect  of  the  social  or  educational 
variable  that  it  represented.   It  should  be  noted  that 
the  selection  of  the  eight  indicators  used  in  this  study 
was in many cases influenced more by data availability than
by the logic or appropriateness of the indicator to represent a
specific social variable. Thus, the eight indicators are
examples of the type of data that might be employed to
operationally define the variables; utilization in a specific
planning situation would require evaluation of the
appropriateness of the indicators presented in this study and the
addition and/or substitution of other indicators.

In  this  study  only  the  variables  related  to  the  outcomes 
of  education  were  used.   As  previously  noted,  this  study  was 
part  of  a  larger  social  forecasting  and  educational  planning 
effort  which  also  included  the  status  of  education  1976-77, 
social  trends  affecting  the  purposes  of  education,  and  social 
trends  affecting  the  resources  for  education. 

While  the  literature  in  mathematics,  statistics,  and 
economics  was  reviewed  and  considered  in  preparation  for 
the  selection  and  use  of  the  three  extrapolative  techniques 
(linear,  log-linear,  and  curvilinear  regression),  there  was 
no  attempt  to  present  the  comparison  of  these  techniques  in 
the detail desired by these disciplines. Rather the
comparison was made in such a way as to be most relevant to the
planner  in  education. 

There  was  no  attempt  to  write  or  adapt  computer  programs 
for  various  techniques.   Instead,  an  effort  was  made  to 
identify and utilize computer programs and statistical
packages which had already been adapted for use at the North East
Regional  Data  Center's  computer  facilities. 
Additionally, the projection of specific trends per se
was not of interest in this study. Rather the focus of this
study was the development of the conceptual framework and
methodology for such projection. Also, there has not been
any attempt to forecast educational outcomes from the
operationally defined social and educational variables. The
present work may be considered an initial step in determining
the feasibility of developing such a mathematical forecasting
model.

Definition  of  Terms 


Extrapolative  forecasting. 

The procedure consists of identifying an
underlying historical trend or cycle in
social processes that can be
extrapolated by means as varied as multiple
regression analysis, time series
analysis, envelope curve fitting, three-mode
factor analysis, correlational analysis,
averages, or any other method that takes
current and historical data as the
principal basis for estimating future states
in a given variable. (Harrison, 1976,
p. 3)

Indicator,  educational. 

Educational  indicators  are  statistics 
that  enable  interested  publics  to  know 
the  status  of  education  at  a  particular 
moment  in  time  with  respect  to  some 
selected  variables,  to  make  comparisons 
in  that  status  over  time  and  to  project 
future  status.   Indicators  are  time-series 
statistics  that  permit  a  study  of  trends 
and change in education. (Gooler, 1976,
p. 11)

Indicator, social. "The operational definition or part
of the operational definition of any one of the concepts
central to the generation of an information system
descriptive of the social system" (Carlisle, 1972, p. 25);
"time-series that allow comparisons over an extended period which
permit one to grasp long-term trends as well as unusually
sharp fluctuations" (Sheldon & Freeman, 1970, p. 97); "a
statistic of direct normative interest which facilitates
concise, comprehensive and balanced judgments about the
condition of major aspects of a society" (U.S. Department of
Health, Education, & Welfare, 1970, p. 97).

Outcomes of education. Those measures of performance,
such as achievement test scores, or utilization, such as
employment rates, which appear to be the result of
participation in the formal educational process.

Regression, linear. Most common type of regression in
which the objective is to locate the best-fitting straight
line through a scattergram based on interval-level variables
(Nie, Hull, Jenkins, Steinbrenner, & Bent, 1975, p. 278).

Regression,  log-linear.  As  used  in  this  study,  a  least 
squares  regression  method  in  which  a  geometric  straight  line 
is  located  through  a  scattergram  plotted  on  semi-logarithmic 
paper;  also  called  exponential  curve  or  trend  curve. 

Regression, polynomial or curvilinear. Regression
method for fitting a curve to a set of data using the
criterion of least squares distances (Nie et al., 1975, p. 278).

Time series. "A set of observations generated
sequentially in time" (Box & Jenkins, 1970, p. 23).
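
For orientation, the three regression forms defined above can be written with time t (in years) as the independent variable and the indicator value y as the dependent variable. This is only a notational sketch: the coefficient symbols a, b, c, d and the use of the natural logarithm are assumptions made here for illustration, and the study's own treatment of the mathematical properties is the one presented in Chapter III.

```latex
\begin{align*}
\text{linear:}     \quad & \hat{y}_t = a + b\,t \\
\text{quadratic:}  \quad & \hat{y}_t = a + b\,t + c\,t^{2} \\
\text{cubic:}      \quad & \hat{y}_t = a + b\,t + c\,t^{2} + d\,t^{3} \\
\text{log-linear:} \quad & \ln \hat{y}_t = a + b\,t
  \;\Longleftrightarrow\; \hat{y}_t = e^{a}\,e^{b\,t}
\end{align*}
```

Under the log-linear form the implied growth rate per year is the constant e^b - 1, which is why the fitted curve appears as a straight line on semi-logarithmic paper.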
Procedures

The study proceeded in the following phases: (a) using
Bronfenbrenner's ecology of education model, 10 variables
that research has shown to be related to the outcomes of
education were selected and, where possible, were operationally
defined as time series indicators; (b) data were collected
for these time series indicators, eight of which were used
in the comparison of the selected extrapolative techniques;
(c) using the selected time series indicators, three purely
extrapolative techniques were compared according to
statistical criteria and practical considerations derived from
the literature; and (d) implications for the use in
educational planning of an ecological model such as
Bronfenbrenner's, time series social and educational indicators,
and selected extrapolative techniques were derived.

The  Selection  and  Operational  Definition  of  Variables 

The work by Collazo, Lewis, and Thomas (1977), completed
during the first year of STAR Project No. R5-175, on
forecasting selected educational outcomes from social variables
was utilized. Since the variables selected by these
investigators were derived from a review of the research
literature and were acknowledged to be appropriate for the stated
social forecasting purposes by a panel of experts in various
disciplines, they appeared to fulfill the requirements of
this study. Additionally, each of the 10 variables selected
for use was described and classified according to
Bronfenbrenner's ecology of education model.

For each variable an attempt was made to identify one
or more types of time series indicators which might logically
represent the variable. For some variables several
indicators were identified, while for others, no indicator could
logically be identified or no time series data were
available for the indicator at the time of the study. This phase
of the study is explained further in Chapter II.

Collection  of  Time  Series  Indicator  Data 

Sources of needed time series data at both the national
and state level were identified in several ways. The
expanding literature on social trends (e.g., U.S. Department of
Health, Education, & Welfare, 1970) and specifically the
literature on these social trends operationalized as social
indicators (e.g., Executive Office of the President, Office
of Management & Budget, 1973) was reviewed. Furthermore,
examination of initial efforts in using time series
indicators related to education by the Office of Technology
Assessment for the United States Congress (Coates, Note 1) and
several state departments of education (e.g., Oregon,
Pennsylvania, & Florida) yielded additional sources. Published
sources of data such as U.S. Census Reports and Florida
Statistical Abstracts were consulted. When data did not
appear to be available in suitable form or for desired time
periods, inquiries and requests were directed to appropriate
sources. Any apparent limitations in the data such as known
measurement error due to sampling technique were noted.
After data collection was completed, eight indicators which
met the criteria outlined in a previous section were selected
for inclusion in the next phase of the study.

Comparison  of  Extrapolative  Methods  Using  Time  Series 
Indicators 

The following steps were involved in this phase of the
study: (a) initial identification and testing of methods
using data similar in form to selected indicators, (b)
reconsideration and testing of additional available methods, (c)
selection of three methods to be used for comparative
extrapolations, (d) derivation of specific criteria and practical
considerations from the literature, (e) application of three
methods to each data set, (f) extrapolation of identified
trend into future using equation generated in (e), and (g)
comparison of actual versus predicted values of indicators.

From a preliminary review of the literature in
statistics, economics, time series analysis, and forecasting
methodology, the following four methods were tentatively
identified for comparison: (a) linear regression (computer
program by Nie et al., 1975), (b) curvilinear or polynomial
regression (computer program by Nie et al., 1975), (c)
Box-Jenkins time series analysis (computer program by Cooper,
Note 2), and (d) FIT curve-fitting with weighted data
(computer program by Stover, Note 3).

An initial analysis of the methods using trial sets of
data combined with a visual analysis of the general form of
the data to be used revealed that two of the methods under
consideration were inappropriate. The Box-Jenkins procedure,
while an extremely powerful tool for time series analysis of
data which are characterized by seasonal or cyclic variation
(usually resulting in autocorrelation of observations and
residuals), did not seem suitable for the social indicator
data collected. (Should subsequent tests reveal
autocorrelation and hence a violation of the assumptions of the linear
model, Box-Jenkins could then be appropriately employed.)
The FIT curve-fitting procedure utilizing a weighted data
principle was rejected because the computer program required
extensive modification to yield necessary comparative
statistics and reliable output. Theoretical justification for the
weighting formula and data transformations employed was
unavailable.

Thus, two of the four methods tentatively considered
were rejected. Since the comparison phase was to involve
three methods, the literature was again searched for other
appropriate methods. The most promising of these was a curve
fitting technique which utilizes an exponential function to
describe a constant growth rate. This method, called
log-linear regression in this study, can be described in terms
of the general linear model and solved by least squares
procedures when the dependent variable undergoes logarithmic
transformation. Since social phenomena sometimes exhibit
what appears to be a constant growth rate, log-linear
regression seemed to be an appropriate method to include in this
study.
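
The log-linear idea described above can be illustrated with a minimal Python sketch, assuming NumPy is available. It regresses the natural logarithm of the indicator on time by ordinary least squares and converts the slope back into a yearly growth rate. The function name `fit_log_linear`, the natural-log base, and the sample series are hypothetical choices made for illustration; the study itself relied on existing SPSS subprograms, not on this code.

```python
import numpy as np

def fit_log_linear(years, values):
    """Fit ln(y) = a + b*t by least squares; t is years since the first year.

    Returns (a, b). Assumes natural logarithms and strictly positive values.
    """
    t = np.asarray(years, dtype=float)
    y = np.asarray(values, dtype=float)
    if np.any(y <= 0):
        raise ValueError("log-linear regression needs strictly positive values")
    # np.polyfit returns coefficients from highest degree down: [slope, intercept]
    b, a = np.polyfit(t - t[0], np.log(y), 1)
    return a, b

# Hypothetical indicator growing by exactly 5 percent per year from 100.
years = np.arange(1960, 1970)
values = 100.0 * 1.05 ** (years - 1960)

a, b = fit_log_linear(years, values)
growth_rate = np.exp(b) - 1.0      # constant yearly growth implied by the fit
start_level = np.exp(a)            # fitted value in the first year
```

Because the transformed model is linear in its coefficients, the same least squares machinery used for the straight-line fit applies unchanged; only the dependent variable is transformed.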
The three methods finally selected for comparison were
(a) linear regression (without data transformation), (b)
curvilinear or polynomial regression, and (c) log-linear
regression. The mathematical properties of each are
presented in Chapter III. All three approaches to trend
extrapolation were executed by using variations of SPSS
subprograms SCATTERGRAM and REGRESSION and that system's data
transformation capabilities (Nie et al., 1975).

Each  of  the  three  methods  was  applied  to  each  of  the 
eight  selected  time  series  indicators.   Time  in  years  was 
used  as  the  independent  variable;  the  annual  measure  or 
index  of  the  indicator  was  treated  as  the  dependent  or 
response  variable.   Each  data  set  was  divided  into  thirds; 
two-thirds  of  the  data  points  were  used  to  establish  the 
prediction  equation.   This  equation  was  then  used  to  predict 
the  remaining  third  of  the  data  points.   Predicted  values 
were  then  compared  with  actual  values. 
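
The hold-out procedure just described might be sketched as follows. This is an illustration under stated assumptions: the text does not say which two-thirds of the observations were used for fitting, so the sketch assumes the chronologically earliest two-thirds, and it shows only the untransformed linear method. All names and data are hypothetical.

```python
def fit_linear(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def split_two_thirds(x, y):
    """Assumed split: earliest two-thirds for fitting, last third held out."""
    cut = (2 * len(x)) // 3
    return x[:cut], y[:cut], x[cut:], y[cut:]


def holdout_errors(x, y):
    """Fit on the first two-thirds; return actual minus predicted for the rest."""
    fit_x, fit_y, hold_x, hold_y = split_two_thirds(x, y)
    intercept, slope = fit_linear(fit_x, fit_y)
    predicted = [intercept + slope * xi for xi in hold_x]
    return [actual - pred for actual, pred in zip(hold_y, predicted)]


# Hypothetical annual indicator: 15 observations on an exact linear trend.
years = list(range(1950, 1965))
rates = [10.0 + 0.5 * (yr - 1950) for yr in years]
errors = holdout_errors(years, rates)
```

With real indicator data the errors would not vanish; their size and pattern over the held-out years are what allow one method to be compared with another.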

Thus,  in  this  phase  of  the  study  three  prediction 
equations  (one  for  each  method)  were  generated  for  each  of 
the  eight  time  series  indicators.   Statistical  criteria  de- 
rived from  the  literature  were  used  to  evaluate  the  "good- 
ness of  fit"  of  the  regression  line  derived  from  the  pre- 
diction equation  to  the  data.   The  distribution  of  error 
(residuals)  about  the  regression  line  was  also  examined  to 
determine  if  the  data  satisfied  the  assumptions  of  the  sta- 
tistical model.   Results  of  the  method  comparison  phase  are 
reported  in  Chapter  IV. 
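
The specific statistical criteria used in this study are those presented in Chapter III. Purely as an illustration, and as an assumption not drawn from this passage, the sketch below computes two widely used quantities: the coefficient of determination as a goodness-of-fit measure, and the Durbin-Watson statistic as a check on first-order autocorrelation of the residuals. The data are hypothetical.

```python
def r_squared(actual, fitted):
    """Proportion of variance in `actual` accounted for by `fitted`."""
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    ss_resid = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    return 1.0 - ss_resid / ss_total


def durbin_watson(residuals):
    """Values near 2 suggest little first-order autocorrelation."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical observed values and values fitted by a regression line.
actual = [2.0, 4.1, 5.9, 8.2, 9.8]
fitted = [2.0, 4.0, 6.0, 8.0, 10.0]
residuals = [a - f for a, f in zip(actual, fitted)]

fit_quality = r_squared(actual, fitted)
dw = durbin_watson(residuals)
```

A Durbin-Watson value well below 2 points to positive autocorrelation and a value well above 2 to negative autocorrelation; either would call the assumptions of the linear model into question.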

Development of Implications for Educational Planning

In  Chapter  V  methodological  strategies  involved  in  the 
selection  and  operational  definition  of  variables  are  analyzed 
in  terms  of  viability  for  future  use.   Results  of  the 
technique  comparison  phase  are  analyzed  according  to  the 
statistical  criteria  and  practical  considerations  derived 
from  the  literature  in  forecasting  methodology  and  statis- 
tics.  In  Chapter  VI  a  summary  of  the  study  and  conclusions 
warranted  by  the  results  of  the  study  are  presented.   Future 
directions  for  research  suggested  by  the  results  of  this 
study  are  discussed.   Additionally,  implications  for  the 
use  in  educational  planning  of  an  ecological  model  such  as 
Bronfenbrenner's, time series social and educational
indicators, and selected extrapolative methods are discussed.


CHAPTER  II 

RATIONALE  FOR  SELECTION  OF  VARIABLES/ 
TIME  SERIES  INDICATORS 


In  the  previous  chapter,  the  need  for  educational 
planners  and  policy  makers  to  have  an  awareness  of  the 
societal  context  of  education  was  emphasized.   To  this  end 
the  Bronfenbrenner  ecology  of  education  model  was  proposed 
as  a  framework  for  the  selection  of  social  variables  which 
affect  the  outcomes  of  the  educational  process.   The  selected 
social  variables  may  then  be  operationalized  as  time  series 
indicators;  trends  in  these  indicators  can  be  identified 
and  extrapolated  into  the  future.   Such  information  might 
then  be  incorporated  into  a  planning  model  in  order  to  assist 
planners  and  policy  makers  in  making  informed  decisions  about 
the  role  and  function  of  education  in  the  future. 

In  order  to  place  the  use  of  time  series  indicators 
described  in  this  study  into  perspective,  in  the  first 
section  of  the  present  chapter  social  indicators  are  dis- 
cussed in  relation  to  their  historical  development,  defini- 
tion and  use,  and  data  base.   Educational  applications  of 
indicators  are  briefly  noted.   In  the  second  section  the 
social  variables  selected  for  use  in  this  study  are  presented 
in  relation  to  the  Bronfenbrenner  model.   These  variables 
are  then  operationally  defined  as  time  series  indicators, 
and the eight indicators selected for use in the comparison
of  the  three  extrapolative  methods  are  listed. 

The  Social  Indicator  Movement 

Historical  Development 

Interest  in  societal  trends  by  policy  planners  is  not 
of  recent  origin  in  the  United  States.   Indeed,  in  1933  a 
presidential  task  force  reported  on  social  trends  in  a  com- 
prehensive work  documenting  social  change  in  the  United  States 
(President's  Research  Committee  on  Social  Trends,  1933).   The 
development  of  indicators,  or  measures,  of  social  change, 
however,  did  not  receive  the  sustained  governmental  support 
that  was  provided  for  indicators  of  the  economic  process. 
Thus,  while  the  development  of  economic  statistics  during  the 
1930's and 1940's provided "a solid basis for economic
analysis and economic reporting which eventually resulted in
the establishment of the Council of Economic Advisors and
the Economic Report" (U.S. Department of Health, Education,
& Welfare, 1970, p. v), comparable development of social
indicators was not undertaken.

In  the  1960's  a  renewed  interest  in  statistics  describing 
the  social  condition  became  apparent.   Impetus  for  the  de- 
velopment of  social  indicators  was  provided  by  social 
scientists  in  various  disciplines,  government  policy  makers, 
and business leaders in the private sector (Brooks, 1972,
p. 1). While this early effort was not well defined as to
membership,  organization,  or  objectives,  the  participants  in 
the social indicator movement "sensed great needs and
opportunities for change, [and] celebrated shared but necessarily
ambiguous symbols" (Sheldon & Parke, 1975, p. 693).

The  following  examples  were  drawn  from  the  many  mani- 
festations of  interest  in  the  development  of  social  indica- 
tors during  the  period  from  1965  through  1975  (over  1000  items 
were  listed  in  a  bibliography  issued  in  late  1972  by  Wilcox, 
Brooks, Beal, & Klonglan):

1.  The  Russell  Sage  Foundation  commissioned  in  1965 
and  published  in  1968  an  independent  study,  Indicators  of 
Social  Change:  Concepts  and  Measurements,  on  a  number  of 
aspects of structural change in society (Sheldon & Moore,
1968).

2.  A  study  on  ways  to  measure  the  impact  of  massive 
scientific  and  technological  change  on  society  (Bauer,  1966) 
was  prepared  by  the  American  Academy  of  Arts  and  Sciences 
for  the  National  Aeronautics  and  Space  Administration.   This 
work,  Social  Indicators,  was  an  overview  of  the  task  of 
developing  indicators  as  part  of  a  feedback  mechanism  docu- 
menting social  change. 

3.  President  Johnson  in  March  of  1966  directed  the 
Secretary  of  Health,  Education  and  Welfare  "to  develop  the 
necessary  social  statistics  and  indicators  to  supplement 
those  prepared  by  the  Bureau  of  Labor  Statistics  and  the 
Council  of  Economic  Advisors"  (U.S.  Department  of  Health, 
Education, & Welfare, 1970, p. iii). The result of this
directive,  Toward  a  Social  Report  issued  in  1969,  was 
considered "a preliminary step toward the evolution of a
regular system of social reporting" (U.S. Department of
Health, Education, & Welfare, 1970, p. iii).

4.  The  Social  Science  Research  Council  established 
in  1972  the  Center  for  Coordination  of  Research  on  Social 
Indicators,  whose  objective  is  "to  enhance  the  contribution 
of  social  science  research  to  the  development  of  a  broad 
range  of  indicators  of  social  change"  (World  Future  Society, 
1977,  p.  97). 

5. The appearance of the landmark government publication,
Social Indicators 1973 (Executive Office of the President:
Office of Management & Budget, 1973), was heralded as a
significant  attempt  to  provide  a  collection  of  social  sta- 
tistics describing  quality  of  life  in  the  United  States. 
This  work,  which  was  scheduled  to  be  up-dated  every  three 
years,  was  a  compilation  of  statistics  on  eight  major  areas 
of  social  interest:   health,  public  safety,  education,  em- 
ployment, income,  housing,  leisure  and  recreation,  and  popu- 
lation.  (It  may  be  noted  that  it  is  often  impossible  to 
strictly  categorize  what  is  social  and  what  is  economic,  as 
almost  all  aspects  of  life  are  the  result  of  interaction  be- 
tween social  and  economic  forces.) 

The  idea  of  systematic  collection  and  use  of  social 
indicators  has  not  always  met  a  favorable  reception.   Bezold 
(Note  4)  and  Shostak  (Note  5)  documented  the  unsuccessful 
efforts,  beginning  in  1967,  by  Walter  F.  Mondale  to  earn 
congressional  approval  of  a  far-reaching  plan  for  new 
government use of the applied social sciences. Mondale's
blueprint  for  better  collection  and  use  of  social  intelli- 
gence involved  two  statutorily-mandated  additions  to  the 
Executive  Office  of  the  President:   (a)  a  Council  of  Social 
Advisors  (CSA)  comparable  to  the  Council  of  Economic  Advisors 
established  in  1946,  and  (b)  an  annual  Social  Report  of  the 
President  prepared  by  the  CSA  to  parallel  the  annual  Economic 
Report  to  the  President.   While  aspects  of  Mondale's  plan  may 
be  satisfied  by  such  efforts  as  Social  Indicators  1973  and 
several  congressional  provisions  which  required  the  develop- 
ment and  application  of  social  science  techniques  to  the 
study  of  present  and  future  national  problems  (Shostak,  Note 
5),  the  comprehensive  nature  of  the  Mondale  plan  is  absent. 
The  future  of  such  governmental  efforts  at  social  accounting 
was  in  1977  uncertain. 

Interest  in  social  indicators  has  not  been  confined  to 
the United States (Johnson, 1975; Sheldon & Parke, 1975).
In 1973, for example, the Organization for Economic Cooperation
and Development (OECD), to which the United States also
belongs, issued a list of social concerns shared by many
member  countries.   The  identification  of  concerns  was  a  first 
step  in  the  development  of 

a  set  of  social  indicators  designed 
explicitly  to  reveal,  with  validity, 
the  level  of  well-being  for  each  social 
concern  in  the  list  and  to  monitor 
changes  in  those  levels  over  time. 
(OECD,  1973,  p.  4) 

Additionally, international organizations such as the
Conference of European Statisticians, the United Nations
Research Institute for Social Development, and the United
Nations Educational, Scientific, and Cultural Organization
have been actively concerned with social indicators (Sheldon
& Parke, 1975). Efforts to develop social indicators have
been  initiated  in  countries  such  as  France,  Great  Britain, 
West  Germany,  Canada,  Japan,  Norway,  Sweden,  and  Denmark 
(Brooks,  1972;  Johnson,  1975). 

Definition  and  Use  of  Social  Indicators 

The  social  indicator  movement  has  been  characterized 
by  ambiguity  of  definition  and  purpose  due,  in  part,  to  the 
heterogeneous  nature  of  participants  with  their  own  back- 
grounds, skills,  and  interests,  and  also,  to  the  necessary 
stages  of  evolution  that  such  a  movement  experiences.   These 
problems  in  definition  and  purpose  of  social  indicators  have 
been  discussed  by  a  number  of  critics  (Land,  1971;  Little, 
1975; Plessas & Fein, 1972; Sheldon & Freeman, 1970; Sheldon
& Land, 1972).

Attempts have been made to resolve a number of these
problems. Land (1971), for example, proposed the following
social science-oriented definition of social indicators:

social indicators refer to social statistics
that (1) are components in a
social  system  model  (including  sociopsy- 
chological,  economic,  demographic,  and 
ecological)  or  of  some  particular  segment 
or  process  thereof,  (2)  can  be  collected 
and  analyzed  at  various  times  and  accumu- 
lated into  a  time  series,  and  (3)  can  be 
aggregated  or  disaggregated  to  levels 
appropriate  to  the  specifications  of  the 
model. . . . The important point is that
the criterion for classifying a social
statistic as a social indicator is its
informative  value  which  derives  from  its 
empirically  verified  nexus  in  a  con- 
ceptualization of  a  social  process. 
(p.  323) 

Part  of  the  confusion  over  definition  is  the  result 
of disagreement over purposes, or uses, of social indicators.
These purposes, or uses, have been considered under a number
of overlapping, sometimes synonymous, headings: (a) descriptive
reporting, (b) policy planning, (c) social accounting,
(d) program evaluation, (e) social modeling, (f) social
forecasting, and (g) social engineering. While the ultimate
objective of guiding social policy is rarely disputed, the
form  of  this  guidance  is  still  debated.   Social  scientists 
are  more  likely  to  be  concerned  with  the  analysis  and  pre- 
diction of  social  change,  while  public  administrators  and 
legislators  are  often  more  concerned  with  uses  of  indicators 
related  to  public  program  evaluation  and  agency  goal  setting. 
Sheldon and Parke (1975), in acknowledging these concerns,
said: 

It  is  apparent  that  many  different 
types  of  work  go  on  under  the  rubric 
of  social  indicators.   What  is  impor- 
tant is  that  the  field  be  seen  as  an 
arena  for  long-term  development,  as 
an  effort  of  social  scientists  to  push 
forward developments in concepts and
in methodology that promise payoffs
to both science and public policy.
(p. 698)

To underscore this point, Sheldon and Parke (1975) selected
an observation by Duncan:

The  value  of  improved  measures  of 
social change . . . is not that they
necessarily resolve theoretical issues
concerning social dynamics or settle
pragmatic issues of social policy,
but that they may permit those issues
to be argued more productively.
(p. 698)


Data  Base  for  Social  Indicators 

Various  efforts  have  been  undertaken  to  improve  the 
data  base  for  social  indicators.   Among  the  efforts  in  the 
early  1970's  were  basic  surveys  on  crime  and  education  as 
well  as  replications  of  previous  social  science  studies  and 
surveys (Sheldon & Parke, 1975).

Most social statistics, available primarily from government
sources, are objective in nature; that is, they measure
the frequency of occurrence of an attribute or commodity in
the population. Numbers of births, deaths, marriages, years
of schooling, and percent of occupied housing with television
sets could thus be considered objective measures. (Some would
disagree, however, with the objectivity of these measures;
see Andrews & Withey, 1976, p. 5.)

Several researchers (Andrews & Withey, 1976; Campbell,
Converse, & Rogers, 1976) have attempted to measure people's
perceptions  of  their  well-being,  their  quality  of  life.   Such 
measures  collected  on  a  regular  basis  are  expected  to  be 
valuable  supplements  to  the  usual  objective  quality  of  life 
indicators. (For examples of the latter, see Liu, 1976;
Thompson,  1976b,  1977.) 

Creation  of  a  social  indicator  data  base  is  not  without 
conceptual  and  methodological  problems.   Various  aspects  of 
the social measurement problem have been acknowledged in the
literature (see, for example, de Neufville, 1975, pp. 175-179;
Etzioni & Lehman, 1969; Social Measurement, 1972). While detailed
discussion of measurement dysfunction (in the terminology
of Etzioni & Lehman, 1969) is beyond the scope of this
study,  the  following  observation  might  be  kept  in  mind: 

Increased  investment,  intellectual  as  well 
as  financial,  no  doubt  can  go  a  long  way  to 
increase  the  efficacy  of  social  measurements 
and  to  reduce  much  of  the  likelihood  of 
dysfunctions.   But,  in  the  final  analysis, 
these  problems  can  never  be  eliminated  en- 
tirely.  Here,  the  client  of  systematic 
measurement  and  accounting  should  be  alerted 
to  the  limitations  of  social  indicators, 
both  to  make  his  use  of  them  more  sophisti- 
cated and  to  prevent  him  from  ultimately 
rejecting  the  idea  of  social  accounting  when 
he encounters its limitations. (Etzioni &
Lehman,  1969,  p.  62) 

Educational  Implications 

Educational  indicators,  a  subset  of  social  indicators, 
have  traditionally  been  measures  of  the  educational  system's 
inputs  and  outputs  stated  in  such  terms  as  numbers  of  tea- 
chers, per  pupil  expenditures,  and  achievement  test  scores. 
There  have  been  attempts,  however,  to  broaden  this  base  of 
educational  statistics  to  include  both  objective  and  sub- 
jective indicators  under  the  categories  of  access,  aspirations, 
achievement, impact, and resources (Gooler, 1976, p. 15).
There  have  also  been  attempts  to  link  indicators  of  social 
processes  (e.g.,  divorce  rates,  voting  rates)  to  educational 
goals  and  thus  to  establish  accountability  measures,  albeit 
remote,  external  to  the  educational  system  (Clemmer,  Fairbanks, 
Hall, Impara, & Nelson, 1974; Collazo, Lewis, & Thomas, Note
6;  Grady,  1974).   The  use  and  abuse  of  indicators  in  an  edu- 
cational setting,  however,  remained  in  1977  a  matter  of  debate 
(Impara,  Note  7)  and  cautious  optimism  (Hall,  Note  8).   Hope- 
fully, investigations  of  the  problem,  such  as  that  described 
in  this  study,  will  provide  some  guidance  as  to  the  most 
promising  applications  of  social  indicators  to  education. 

Selection  of  Variables/Time  Series  Indicators 

The  Variables 

In  the  first  year  (Sept.  1975-June  1976)  of  Florida 
Department  of  Education  STAR  Project  R5-175  on  social  fore- 
casting for  educational  planning,  trends  in  five  indicators 
of  educational  outcomes  were  forecast.   In  order  to  do  this, 
it  was  necessary  to  identify  variables  that  influence  the 
outcomes  of  education.   Through  a  review  of  the  research  and 
theoretical  literature,  a  number  of  social  variables  were 
identified.   This  list  was  refined  by  an  interdisciplinary 
panel  of  experts  at  the  University  of  Florida  to  the 
following  10  variables:   (a)  socio-economic  status;  (b)  family 
expectations,  attitudes,  and  aspirations;  (c)  student's  self- 
concept;  (d)  student's  general  ability;  (e)  student's  sense 
of  fate  control;  (f)  student's  attitudes  and  motivation;  (g) 
peer  group  characteristics;  (h)  teacher  expectations;  (i) 
teacher  behavior  in  the  classroom;  and  (j)  administrative 
leadership  style.   Collazo  et  al.  (1977)  said  that  only  the 
variables  (a)  and  (d)  received  strong  support  from  research; 
a number of the other variables, while "identified as important
in the theoretical literature . . . had inconclusive
support from research" (p. 298). (See Collazo, Lewis, &
Thomas, Note 9, for a review of the research literature on
variables  affecting  educational  outcomes.) 

The  panel  of  experts  was  further  utilized  to  forecast 
the  future  trends  of  these  variables  and  their  effect  on 
specified performance and utilization measures of the outcomes
of education. Cross-impact analysis, a computer-assisted
modification of the Delphi forecasting technique, was then
used  by  the  panel  to  generate  the  future  trends  in  the  five 
outcome  indicators. 

The  framework  for  looking  at  the  future  established 
during  these  first  year  project  activities  is  utilized  in 
the  present  study.   Previous  forecasting  activities  were  based 
primarily  on  the  subjective  judgment  of  panel  participants. 
In  this  study,  however,  the  feasibility  of  using  time  series 
data,  where  available,  as  the  basis  for  forecasting  future 
trends  in  the  10  variables  affecting  educational  outcomes  is 
examined.   In  addition,  the  use  of  a  model  containing  the 
selected  variables  is  considered. 

Bronfenbrenner's Ecology of Education Model

In  the  previous  section,  the  10  variables  affecting 
educational  outcomes  which  were  derived  from  the  research 
literature  were  presented.   How  can  these  variables  be  put 
into  perspective  as  social  forces  influencing  what  the  stu- 
dent learns? 

The Bronfenbrenner (1976) model which was presented in
Chapter I (pp. 1-3) is a multi-dimensional ecological structure
of the educational environment. At the center of the
interacting meso-, exo- and macro-systems is the micro-system,
"the immediate setting containing the learner" (Bronfenbrenner,
1976, p. 5). The meso-system is actually a system of micro-systems;
that is, it "comprises the inter-relationships among
the major settings containing the learner at a particular
point in his or her life" (Bronfenbrenner, 1976, p. 5). Some
of  the  social  variables  that  were  identified  previously  could 
be considered as part of the meso-system. The home, for example,
is represented by socioeconomic status and family
expectations, attitudes, and aspirations; the peer group by
peer group characteristics; and the school by teacher
expectations, teacher behavior in the classroom, and administrative
leadership style. The other variables (student's
self-concept, student's general ability, student's sense of fate
control, and student's attitudes and motivation) are all
directly related to the learner.

Bronfenbrenner  (1976)  proposed  that  learning  is  a  func- 
tion of  (a)  the  dynamic  relationship  between  characteristics 
of  the  learners  and  their  various  surroundings  (meso-system) 
and  (b)  the  interaction  between  these  various  environments 
(e.g.,  home,  school,  peer  group).   The  Bronfenbrenner  ecology 
of  education  model  thus  appears  to  provide  the  necessary 
framework  to  support  use  of  the  presently  identified  variables 
and  to  generate  directions  for  future  forecasting  research. 

Operational Definition of Variables as Time Series Indicators

In  previous  sections  10  variables  affecting  educational 
outcomes  were  presented  and  then  classified  according  to  the 
Bronfenbrenner  ecology  of  education  model.   In  order  to  iden- 
tify trends  in  these  variables  and  to  extrapolate  these  trends 
into  the  future,  it  was  necessary  to  operationally  define 
these  variables  as  time  series  measures,  or  indicators.   Since 
some  of  these  variables  were  expressed  in  general  terms,  it 
seemed  necessary  to  try  to  represent  each  by  a  number  of 
measures  and  thus  avoid  "fractional  measurement"  which  is 
often  a  concern  when  operationally  defining  a  social  concept 
(Etzioni  §  Lehman,  1969). 

Several  problems  became  apparent  in  operationalizing 
the  variables: 

1.  A  number  of  indicators  were  identified  for  the 
variables  (a)  socioeconomic  status;  (b)  family  expectations, 
attitudes,  and  aspirations;  and  (c)  peer  group  characteristics. 
For  some  indicators,  however,  data  were  not  collected  annually; 
for  others,  measures  were  not  comparable  over  time  due  to  a 
different  basis  for  measurement. 

2. For the variables related to the school and student
characteristics (except student attitudes and motivation),
no time series data which met the criteria for selection were
available.

3.  Operational  definitions  were  in  many  cases  influenced 
by  the  availability  of  indicators  rather  than  the  logic  or 
appropriateness  of  the  indicator  to  measure  the  social 
concept  it  represented. 

The social variables, examples of indicators that might
be  used  to  operationally  define  these  variables,  and  sources 
of  the  available  time  series  data  are  presented  in  Table  1. 
The  following  eight  indicators  which  met  the  criteria  estab- 
lished for  this  study  (see  p.  9)  were  selected  for  use  with 
the  three  extrapolative  methods  described  in  Chapter  III: 

1.  Median  family  income  in  the  United  States  ex- 
pressed in  1971  constant  dollars. 

2.  Number  of  families  in  the  United  States  headed  by 
women  expressed  as  a  percentage  of  total  families. 

3.  Number  of  wives  in  the  labor  force  expressed  as 
a  percentage  of  total  wives  in  the  United  States. 

4.  Number  of  marriages  in  Florida  expressed  as  rate 
per  1,000  population  in  Florida. 

5.  Number  of  dissolutions  of  marriage  in  Florida  ex- 
pressed as  rate  per  1,000  population  in  Florida. 

6.  Number  of  resident  live  births  in  Florida  ex- 
pressed as  rate  per  1,000  population  in  Florida. 

7.  Number  of  3  to  5  year  olds  enrolled  in  nursery 
school  and  kindergarten  expressed  as  percentage  of  total 
children  3  to  5  years  old  in  the  United  States. 

8.  Number  of  children  involved  in  divorce  or  annulment 
expressed  as  rate  per  1,000  children  under  18  years  old  in 
the  United  States. 

While  rates  or  percentages  are  used  for  forecasting 
purposes,  the  magnitude  of  the  actual  numbers  should  be  kept 
in  mind  before  interpretation  of  an  identified  trend  is 
attempted. Furthermore, it is necessary to remember that
since  the  population  base  increased  over  the  decades  covered 
by  the  data,  a  stable  rate  or  percentage  still  represents 
larger  absolute  numbers  of  the  phenomenon.   The  indicators 
selected  are  aggregated  to  either  the  state  or  national  level; 
the  appropriate  level  of  aggregation  would,  of  course,  depend 
upon  the  specific  planning  activity.   These  indicators  could 
be  disaggregated  by  race,  age,  region,  or  sex  (where  appro- 
priate) for  comparative  analysis,  and  indeed  this  feature  is 
a  necessary  characteristic  in  many  of  the  definitions  of 
social  indicators  (e.g.,  see  definition  by  Land,  1971,  pre- 
sented earlier  in  this  chapter). 
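
The caution about rates and absolute numbers can be illustrated with a small calculation. The figures below are hypothetical and are not taken from the indicators in this study; the function name is likewise an assumption made for the example.

```python
def implied_count(rate_per_1000, population):
    """Absolute number of events implied by a rate per 1,000 population."""
    return rate_per_1000 * population / 1000.0


# Hypothetical: the rate is unchanged while the population base grows.
early_count = implied_count(10.0, 5_000_000)
later_count = implied_count(10.0, 8_000_000)
increase = later_count - early_count
```

In this example a rate that is stable at 10 per 1,000 corresponds to 50,000 events at the earlier date and 80,000 at the later one, so a flat trend line still implies a growing absolute number for the planner to accommodate.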


CHAPTER  III 

RATIONALE  FOR  EXTRAPOLATIVE  METHODS 
SELECTED  FOR  COMPARISON 


One  of  the  purposes  of  this  study  was  to  compare  three 
purely  extrapolative  methods  which  could  be  used  with  social 
indicator  data  of  the  type  described  in  Chapter  II  to  fore- 
cast future  values  of  those  indicators.   In  order  to  select 
methods  which  were  appropriate  for  this  purpose,  both  the 
general  forecasting  literature  and  forecasting  applications 
of  extrapolative  techniques  in  specific  areas  were  reviewed. 
Detailed  descriptions  of  each  technique  as  well  as  statisti- 
cal assumptions,  sensitivity,  and  evaluative  criteria  were 
derived  primarily  from  the  literature  in  economic  statistics 
and  regression  analysis.   The  following  sections  provide  (a) 
an  overview  of  the  extrapolative  methods  used  in  forecasting, 
(b)  evaluation  of  the  applicability  of  these  methods  for  the 
purpose of this study, and (c) a description of the three methods
selected  for  comparison,  including  equations,  parameters  to 
be  estimated,  assumptions,  and  criteria  to  be  used  in  the 
comparison  of  the  three  methods. 

Overview  of  Extrapolative  Forecasting  Methods 

An  extrapolative  forecasting  method  is  a  procedure  for 
(a)  identifying  an  underlying  historical  trend  or  cycle  in 
time series data, and (b) estimating future states of a
variable  based  on  current  and  historical  observations/mea- 
sures of  that  variable  (Harrison,  1976).   Extrapolation  pro- 
vides a  "surprise  free"  projection  of  the  future,  but  not 
necessarily  a  future  which  is  a  bigger  and  better  (or  worse) 
version  of  the  present.   Martino  (1976)  noted  that 

some  extrapolation  methods  allow  the 
forecaster  to  identify  policy  variables 
which  are  subject  to  manipulation  and 
which  allow  the  decision-maker  to  alter 
the  future  away  from  today's  pattern 
of  events.   (p.  4) 

In  the  social  realm  extrapolation  of  trends  may  at  least 
allow the planner or policy maker to make enlightened
decisions to prepare for the future.

Economic  and  Business  Forecasting 

Because  of  the  impetus  in  the  1930's  and  1940's  to 
describe  and  forecast  the  economic  condition,  many  extra- 
polative  methods  were  developed  with  economic  applications 
in  mind.   Greenwald  (1963,  p.  187)  classified  methods  for 
determining  economic  trends  into  (a)  non-mathematical  methods 
such  as  freehand  curve  fitting,  first-order  differences,  semi- 
averages,  selected  points,  and  weighted  and  unweighted  moving 
averages;  and  (b)  mathematical  methods  such  as  least  squares, 
moments,  maximum  likelihood,  and  others.   In  general,  only 
the  mathematical  methods,  which  include  a  widely  diverse 
array of complex curve-fitting techniques, seem to be relied
upon  for  forecasting  purposes  while  the  non-mathematical 
methods  are  used  for  preliminary  analysis  of  the  shape  of  the 
time series data. (For descriptions of these methods, see
Greenwald, 1963; Mayes & Mayes, 1976; Mendenhall & Reinmuth,
1971; Neiswanger, 1956; Tuttle, 1957.)
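
Three of the preliminary methods named above (unweighted moving averages, weighted moving averages, and first-order differences) might be sketched as follows. The function names, the weights, and the sample series are hypothetical; the sketch is meant only to show how such methods expose the shape of a series before a curve is fitted.

```python
def moving_average(series, window):
    """Unweighted moving average over `window` consecutive observations."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


def weighted_moving_average(series, weights):
    """Moving average in which each position in the window has its own weight."""
    width = len(weights)
    total = sum(weights)
    return [
        sum(w * v for w, v in zip(weights, series[i:i + width])) / total
        for i in range(len(series) - width + 1)
    ]


def first_differences(series):
    """Change from each observation to the next; roughly constant for a linear trend."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical annual series on an exact linear trend.
series = [1.0, 2.0, 3.0, 4.0, 5.0]
smoothed = moving_average(series, 3)
weighted = weighted_moving_average(series, [1, 2, 1])
diffs = first_differences(series)
```

Roughly constant first differences suggest a straight-line trend, whereas differences that grow in proportion to the level of the series suggest the constant growth rate for which a logarithmic transformation is appropriate.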

Approaches  to  governmental/national  economic  forecasting 
(e.g., Theil, 1966) often reach a relatively high level of
mathematical  and  theoretical  sophistication.   This  appears 
to  be  the  result  of  decades  of  development,  of  applying 
method  in  light  of  theory,  and  developing  both  in  turn.   It 
is  also  the  result  of  substantial  investment  of  financial 
and  manpower  resources  by  both  government  and  industry. 

The  value  of  extrapolative  forecasting  to  individual 
decision-makers  in  business  has  become  apparent  (Makridakis, 
Hodgsdon, & Wheelwright, 1974). Indeed, companies of all
sizes  are  compelled  to  make  forecasts  for  a  number  of  varia- 
bles which  affect  them.   Makridakis  et  al.  (1974)  have  noted, 
however,  that 

as  with  the  development  of  most  management 
science  techniques,  the  application  of 
these  [extrapolative  forecasting]  methods 
has  lagged  behind  their  theoretical  formu- 
lation and  verification.   (p.  153) 

Thus, the authors observed that while managers in business
recognize the need for forecasting methods, few are familiar
enough with the numerous techniques available and their
characteristics to select the one most appropriate for a given
situation. To help meet this need, Makridakis
et al. have developed an interactive forecasting system
(called Interactive Forecasting [SIBYL/RUNNER]) which allows
a number of factors to be considered in the selection of a
forecasting technique for a given set of data. Although
the  system  has  been  well  tested  in  teaching  situations,  it 
has  not  had  extensive  application  in  actual  business  settings. 
Quantitative  techniques  available  in  the  Interactive  Forecas- 
ting (SIBYL/RUNNER)  system  fall  under  the  general  headings  of 
smoothing,  decomposition,  control,  regression,  and  other 
techniques.   The  techniques  considered  under  those  headings 
are  clearly  explained  in  a  subsequent  work  of  two  of  the 
authors (Wheelwright & Makridakis, 1977).

Technological  Forecasting 

Martino  (1973b;  1976)  described  the  extrapolative  methods 
most  commonly  used  in  technological  forecasting  in  relation 
to  the  shape  of  their  fitted  curves:   (a)  growth  curve,  an 
S-shaped curve, which requires the setting of an upper limit; and
(b)  trend  curve,  an  exponential  function  which  takes  the  form 
of  a  straight  line  when  logarithmic  transformation  of  the 
data  is  undertaken.   Martino  (1973b)  illustrated  the  use  of 
the  growth  curve  with  data  on  lowest  temperature  achieved  in 
the  laboratory  by  artificial  means  and  the  trend  curve  with 
data  on  productivity  in  the  aircraft  industry. 

It  should  be  noted  that  both  the  growth  curve  and  the 
trend  curve  applied  to  technological  change  by  Martino  (1973b) 
are  highly  versatile  approaches  with  applications  in  a  number 
of  disciplines.   Both  methods  are  derived  from  the  least 
squares  formula  for  a  straight  line.   The  growth  curve  is 
a  modified  exponential,  that  is,  it  represents  a  variable 
which  changes  at  a  changing  rate;  the  trend  curve  is  a 
geometric straight line which represents a variable which
changes  at  a  constant  rate  (Neiswanger,  1956). 

Educational  Forecasting 

Uses of extrapolative methods in education have generally
been  limited  to  projections  of  expenditures,  school  enroll- 
ments, and  the  number  of  instructional  staff,  high  school 
graduates,  and  earned  degrees.   While  many  states  and  school 
districts  have  developed  their  own  models,  especially  for 
projections of enrollments, the National Center for Education
Statistics (U.S. Department of Health, Education, & Welfare,
1977c), in developing projections of education statistics
to 1985-86, relied on regression methods wherever a trend
could  be  established.   Specifically,  either  arithmetic 
straight  lines  or  logistic  growth  curves,  depending  upon  the 
nature  of  data,  were  fitted  by  the  method  of  least  squares. 
The  following  was  noted,  however: 

For  both  the  straight  line  and  logistic 
growth  curve,  the  fitted  curve  often  lies 
considerably  above  or  below  the  last  ob- 
served point,  resulting  in  an  unusual 
rise  or  drop  from  the  last  actual  observa- 
tion.  To  avoid  this  and  give  face  validity 
to  the  projections,  the  fitted  curve  was 
used  only  to  establish  the  last  point, 
and  a  new  curve  was  drawn  through  the  last 
observed  ratio  and  the  end  point  on  the 
fitted  curve.   (U.S.  Department  of  Health, 
Education, & Welfare, 1977c, p. 92)

Brown  (1974)  summarized  the  use  of  trend  analysis  methods 
in  education  and  noted  their  potential  applications  in  educa- 
tional administration.   The  four  extrapolative  methods  that 
he  critiqued  were  (a)  arithmetic  straight  line  extrapolation, 
(b) time series analysis (really a simplified version of the
Box-Jenkins technique), (c) the S-shaped growth curve, and
(d) cohort analysis (actually the trend curve described by
Martino in the previous section). The examples selected by
Brown  do  not  reveal  the  versatility  of  the  methods  illustra- 
ted; he  did,  however,  provide  a  comprehensive  review  of 
literature  describing  applications  in  other  fields.   A  number 
of  methodological  concerns  raised  by  Brown  were  considered 
in  this  study. 

In  a  critique  of  selected  futures  prediction  techniques 
that  might  be  employed  by  educational  planners,  Folk  (1976) 
observed  that  exponential  trend  line  and  arithmetic  straight 
line  projections  appear  to  be  the  most  commonly  used  extra- 
polative  techniques.   This  author  provided  a  number  of  useful 
measures  for  evaluating  statistically  derived  regression 
lines.

The  educational  applications  just  described  are  basically 
attempts  to  project  inputs  such  as  money,  pupils,  or  teachers 
to  the  educational  system  or  outputs  (graduates,  degrees 
earned)  of  that  system.   No  attempt  to  extrapolate  the  future 
status  of  variables  which  affect  these  student-related  inputs 
or  outputs  was  discovered  in  the  literature  search. 

Extrapolative  Methods  in  Other  Areas 

Several  areas  have  developed  highly  specialized  extrapo- 
lative methods  in  making  forecasts  of  the  future.   Popula- 
tion, employment,  and  unemployment  projections,  for  example, 
are usually based on fairly complex models which incorporate
a number of factors. These particular applications are not
reviewed here due to their highly specialized purposes and
functions.

Applicability of Reviewed Extrapolative Methods for Study

In evaluating the applicability of the previously reviewed
extrapolative methods for projecting future states of
the time series indicators selected for use in this study,
several points needed to be considered. Chief among these
were (a) the underlying pattern of the data that can be
recognized and (b) the type or class of model desired (from
Wheelwright & Makridakis, 1977). Both of these will be
briefly considered in relation to this study.

The  Pattern  of  the  Data 

From  graphical  representations  of  each  indicator,  the 
data  for  each  appeared  to  be  characterized  by  a  trend  which 
either  increased  or  decreased  with  time.   Some  also  appeared 
to  contain  cyclical  patterns  and  random  fluctuations.   It 
seemed  as  if  major  trends  might  follow  the  form  of  a  straight 
line  or  curve  with  one  or  two  bends. 

The  Class  of  Model 

Wheelwright  and  Makridakis  (1977)  distinguished  four 
classes  or  categories  of  models: 

1. The time series model "always assumes that some
pattern or combination of patterns is recurring over time"
(p. 22).

2. The causal model assumes "that the value of a certain
variable is a function of several other variables" (p. 23).

3.  The  statistical  model  comprises  a  number  of  fore- 
casting techniques;  it  uses  the  language  and 

procedures  of  statistical  analysis  to 
identify  patterns  in  the  variables  being 
forecast  and  in  making  statements  about 
the reliability of these forecasts. (p. 23)

4.  The  nonstatistical  model  includes  "all  models  that 
do  not  follow  the  general  rules  of  statistical  analysis  and 
probability"  (p.  24). 

Of  course,  some  techniques  can  be  classified  into  more  than 
one  of  the  four  types  of  models.   It  appeared  that  the 
statistical  model,  with  its  well-defined  properties,  and 
replicable  procedures,  would  be  an  appropriate  starting 
point  for  predicting  the  long-term  trends  in  the  selected 
time  series  data. 

The  review  of  the  literature  revealed  several  techniques 
denoted by the form of their curves which are sensitive to
long-term trends in the data and which are classified under
the statistical model: (a) the arithmetic straight line,
(b) the S-shaped growth or logistic curve, (c) the trend or
exponential curve, and (d) the polynomial curve. All of these
techniques  are  regression  techniques  solved  by  least  squares 
procedures.   Techniques  (b)  through  (d)  require  data  trans- 
formations to  satisfy  the  basic  linear  model  used  in  regres- 
sion.  The  growth  or  logistic  curve  was  eliminated  from  com- 
parison because  this  technique  necessitates  the  setting  of 
limits which might bias the results of the study due to its
ex  post  facto  nature.   The  remaining  three  techniques  were 
considered  to  be  appropriate  for  use  in  the  comparison  phase 
of  this  study. 

Description  of  Methods  to  be  Compared 

Since  the  three  techniques  selected  for  comparison  are 
intrinsically linear in their parameters (Draper & Smith,
1966), the general linear model denoted by the simple or
bivariate  regression  equation  is  presented  first.   Addition- 
ally, estimation  of  the  parameters  of  the  equation  by  least 
squares  procedures,  the  assumptions  of  the  model,  and  criteria 
for  evaluation  and  comparison  of  the  three  methods  are  dis- 
cussed.  Each  technique  is  then  described  in  relation  to  the 
general  linear  model. 

The  General  Linear  Model 

In  the  comparison  of  methods  using  selected  time  series 
indicators,  time  in  years  is  considered  the  independent 
variable  and  the  indicator  is  considered  the  dependent  or 
response variable. Thus, if time is denoted by X, and the
indicator is denoted by Y, a functional relationship in the
form

\[ Y = f(X) \]

might be stated. However, since most social relationships
are stochastic (probabilistic) rather than deterministic in
nature, a more appropriate form might be

\[ Y = f(X) + e, \]

where e represents error, a measure of the unknown factors.
When the relationship between the two variables, time (X) and
the indicator (Y), is assumed to be linear (that is, represented
by a straight line), the equation becomes

\[ Y = \beta_0 + \beta_1 X; \]

and because many social relationships are stochastic for
particular values of the variables, this equation is actually

\[ Y = \beta_0 + \beta_1 X + e. \]

Since the population parameters $\beta_0$ and $\beta_1$ are not known unless
all possible occurrences of X and Y are known, the available
data are used to provide estimates $b_0$ and $b_1$ of $\beta_0$ and $\beta_1$ as
in the following regression equation,

\[ \hat{Y} = b_0 + b_1 X + e \]

(where $\hat{Y}$ denotes predicted values of Y).

The constant $b_0$ (the intercept) and the regression coefficient
$b_1$ (the slope of the regression line) can be determined by
the ordinary least squares procedure, "so called because it
estimates . . . in such a way that the sum of squared residuals,
$\sum e_i^2$, is as small as possible" (Mayes & Mayes, 1976, p. 112).
(For detailed treatment of simple regression and least squares
estimation of $\beta_0$ and $\beta_1$, see, for example: Draper & Smith,
1966; Kerlinger & Pedhazur, 1973; Mayes & Mayes, 1976;
Mendenhall, Ott, & Larson, 1974; Mendenhall & Reinmuth, 1971;
Runyon & Haber, 1967.)
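
The least squares estimates described above have a simple closed form for a single predictor. The following is a minimal sketch in Python, not the procedure used in this study; the function name `fit_line` and the five data points are hypothetical and serve only to illustrate how $b_0$ and $b_1$ follow from the sums of squares and cross-products.

```python
def fit_line(x, y):
    """Return (b0, b1) that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1


# Hypothetical data: X is a year index, Y is an indicator value.
years = [1, 2, 3, 4, 5]
values = [2.0, 4.1, 5.9, 8.2, 9.8]

b0, b1 = fit_line(years, values)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(years, values)]
print(b0, b1, sum(e ** 2 for e in residuals))
```

With an intercept in the equation the residuals sum to zero, which is a quick check that the fit was computed correctly.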

The Assumptions of the Linear Model

Draper  and  Smith  (1966)  noted  that 

In  many  aspects  of  statistics  it  is 
necessary  to  assume  a  mathematical 
model  to  make  progress.   It  might  be 
well  to  emphasize  that  what  we  are 
usually  doing  is  to  consider  or 
tentatively  entertain  our  model. 
(p. 8)

Thus, when the general linear model is employed as it is in
this study, it becomes necessary to examine the assumptions
upon which the model is based and to judge whether the model
is in fact appropriate for the data.

Assumptions for the general linear model include the
following:

1. The regression equation

\[ \hat{Y} = b_0 + b_1 X + e \]

is a better predictor of Y than

\[ \hat{Y} = \bar{Y} \quad (b_1 \neq 0). \]

2.  The  regression  equation  accounts  for  a  significant 
portion  of  the  variation  in  Y,  that  is,  the  relationship 
between  X  and  Y  described  by  the  equation  is  not  the  result 
of  chance. 

3.  The  error  term  e  has  a  mean  value  equal  to  zero  and 
variance equal to $\sigma^2$; it is an independent random variable
which  is  normally  distributed. 

If  the  first  two  assumptions  are  not  met,  then  the  model 
is  not  a  good  predictor  for  that  data.   If  the  third  assump- 
tion is  not  met,  then  it  is  not  appropriate  to  interpret  the 
results  statistically,  that  is,  in  terms  of  the  probability 
distribution of the random error e. It is possible to test
Assumption 1 and Assumption 2 by the F statistic. Assumption
3 is best evaluated by plotting the residuals and examining
the pattern of the deviations from the regression line
(Anscombe, 1973; Anscombe & Tukey, 1963; Draper & Smith,
1966). Independence of the errors (Assumption 3[e]) may be
tested by the Durbin-Watson test for serial correlation
(Durbin & Watson, 1950; Durbin & Watson, 1951; Mayes & Mayes,
1976; Wheelwright & Makridakis, 1977).
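
The two checks named above can be sketched briefly. The Python below is an illustration under stated assumptions rather than the computation performed in this study: the helper names and the data are hypothetical, the F ratio is the usual one-predictor form (regression sum of squares over residual mean square), and the Durbin-Watson d is the standard ratio of squared successive residual differences to squared residuals. Significance would still have to be judged against tabled critical values, which are not reproduced here.

```python
def fit_line(x, y):
    """Least squares intercept and slope for one predictor."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def f_statistic(x, y):
    """F = (SS regression / 1) / (SS residual / (N - 2)) for a straight line."""
    b0, b1 = fit_line(x, y)
    y_bar = sum(y) / len(y)
    fitted = [b0 + b1 * xi for xi in x]
    ss_reg = sum((fi - y_bar) ** 2 for fi in fitted)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return ss_reg / (ss_res / (len(y) - 2))


def durbin_watson(residuals):
    """d near 2 suggests no first-order serial correlation in the residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical yearly indicator values.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
b0, b1 = fit_line(x, y)
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(f_statistic(x, y), durbin_watson(resid))
```

Values of d well below 2 point toward positive serial correlation and values well above 2 toward negative serial correlation.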

Criteria  for  Comparison  of  Methods 

The  following  questions  were  derived  from  the  literature 
to  guide  the  comparison  of  methods: 

1.  Do  the  data  satisfy  the  assumptions  of  the  model? 
(See  previous  section.) 

2.  How  well  does  the  regression  line  fit  the  data  from 
which  it  was  derived  (the  two-thirds  of  the  data  points  used 
to  generate  the  prediction  equation)?   Tufte  (1974,  pp.  69-70) 
listed  four  measures  of  quality  of  fit: 

a. the N residuals: $Y_i - \hat{Y}_i$

b. the residual variation

\[ S^2_{y \cdot x} = \frac{\sum (Y_i - \hat{Y}_i)^2}{N - k^{*} - 1} \]

(or the square root of the residual variation, $S_{y \cdot x}$, called
the unbiased standard error of estimate).

*k refers to the number of X terms in the regression
equation.

c. the ratio of explained to total variation

\[ r^2 = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2} \]

d. the standard error of the estimate of the slope

\[ S_{b_1} = \frac{S_{y \cdot x}}{\sqrt{\sum (X_i - \bar{X})^2}} \]

Thus,  for  each  set  of  data,  the  methods  are  compared 
according  to  these  four  measures.   The  observed  and  pre- 
dicted values  of  Y  are  also  reported  in  tabular  form;  both 
observed  and  predicted  values  are  plotted  for  visual  com- 
parison as  recommended  by  Anscombe  (1973). 

3.   How  well  does  the  extrapolated  line  fit  the  data 
(the  one-third  of  the  data  points  that  were  not  used  to 
generate  the  prediction  equation)?   The  residual  variation 
around  the  extrapolated  line,  which  is  an  indicator  of  the 
accuracy  of  the  forecasting  technique,  may  be  expressed  by 
its  square  root,  the  standard  error  for  the  extrapolated 
values.   As  in  (2),  the  observed  and  extrapolated  values 
are  reported  in  tabular  form;  both  observed  and  extrapolated 
values  are  plotted  for  visual  comparison. 
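
Questions 2 and 3 can be illustrated together. The sketch below, in Python, is not the study's own computation: the function names and the nine data points are hypothetical. It fits a straight line to the first two-thirds of a series, reports the fit measures listed under question 2, and then computes the root mean square deviation of the held-out final third from the extrapolated line, as described under question 3.

```python
import math


def fit_line(x, y):
    """Least squares intercept and slope for one predictor."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def fit_measures(x, y, k=1):
    """Residuals, S_yx, r^2, and S_b1 for a straight line (k = 1 X term)."""
    n = len(x)
    b0, b1 = fit_line(x, y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    fitted = [b0 + b1 * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    ss_res = sum(e ** 2 for e in residuals)
    s_yx = math.sqrt(ss_res / (n - k - 1))            # unbiased standard error
    r2 = (sum((fi - y_bar) ** 2 for fi in fitted)
          / sum((yi - y_bar) ** 2 for yi in y))       # explained / total
    s_b1 = s_yx / math.sqrt(sum((xi - x_bar) ** 2 for xi in x))
    return {"residuals": residuals, "s_yx": s_yx, "r2": r2, "s_b1": s_b1}


def extrapolation_error(x_fit, y_fit, x_hold, y_hold):
    """Root mean square deviation of held-out Y from the extrapolated line."""
    b0, b1 = fit_line(x_fit, y_fit)
    sq = [(yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x_hold, y_hold)]
    return math.sqrt(sum(sq) / len(sq))


# Hypothetical series of nine yearly observations.
x = list(range(1, 10))
y = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1, 14.6, 17.0, 19.9]
cut = (2 * len(x)) // 3  # first two-thirds generate the equation
print(fit_measures(x[:cut], y[:cut]))
print(extrapolation_error(x[:cut], y[:cut], x[cut:], y[cut:]))
```

A method that fits the first two-thirds well but yields a large extrapolation error on the final third would rank poorly on question 3 despite a high $r^2$.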

Neiswanger  (1956,  p.  534)  cautioned  against  accepting 
only  mathematical  tests  of  "goodness  of  fit"  as  proof  that 
the  mathematical  expression  is  appropriate  for  the  trend  in 
the  data.   Other  considerations  such  as  the  "reasonableness 
of  the  extrapolated  values  which  the  trend  may  yield"  (p.  534) 
and "the extent to which this statistical manifestation of
growth  is  supported  by  other  evidence"  (p.  534)  should  be  kept 
in  mind.   Thus,  the  calculation  of  a  trend  is  more  than  a 
mathematical  analysis  in  curve  fitting;  it  is  essentially  a 
problem  of  analysis  of  the  phenomena  represented  by  the  data 
(Neiswanger,  1956). 

It should also be noted that while the standard error of
estimate gives an overall measure of error around the regression
line, it may not be appropriate for computing confidence
intervals for a specific forecast value. The reason for this
is that the farther a value of X is from the mean $\bar{X}$, the larger
is the error that may be expected when predicting Y from the
regression line. Draper and Smith (1966) noted:

We  might  expect  to  make  our  "best"  pre- 
dictions in  the  "middle"  of  our  observed 
range  of  X  and  would  expect  our  predic- 
tions to  be  less  good  away  from  the 
"middle."   (p.  22) 

Therefore, the confidence limits for the true value of Y for
a given X are two curved lines about the regression line.
The limits change as the position of X changes. Hence the
following equation was provided by Wheelwright and Makridakis
(1977, p. 82) for computing the standard error of forecast
($SE_f$) for a specific forecast value:

\[ SE_f = \sqrt{\frac{\sum (Y_i - \hat{Y}_i)^2}{N - k}} \; \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum (X_i - \bar{X})^2}} \]
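
The widening of the limits away from the mean of X can be seen numerically. The following Python sketch is offered under assumptions: the function name and data are hypothetical, and k is read here as the number of estimated coefficients (2 for a straight line), which is one interpretation of the notation in the equation above rather than something the source states.

```python
import math


def standard_error_of_forecast(x, y, x_new, n_coeffs=2):
    """SE of forecast at x_new for a least squares straight line.

    n_coeffs plays the role of k in the equation above (assumed to be 2).
    """
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    base = math.sqrt(ss_res / (n - n_coeffs))
    # The third term grows with the squared distance of x_new from the mean.
    return base * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)


# Hypothetical data; X = 3 is the mean of the observed X values.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
se_mid = standard_error_of_forecast(x, y, 3)
se_far = standard_error_of_forecast(x, y, 8)
print(se_mid, se_far)
```

The error at X = 8, three years beyond the observed range, is larger than at the mean, which is the reason a single standard error of estimate understates the uncertainty of extrapolated values.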

Method 1: Simple Linear Regression

The equation

\[ Y = f(X) \]

describes a natural functional relationship between X and
Y. If this functional relationship can be expressed by a
straight line on arithmetic paper, the linear, first-order
regression equation

\[ \hat{Y} = b_0 + b_1 X + e \]

may be appropriate. The natural linear function is used when
an absolute amount of change in Y per unit of X is hypothesized.

Method  2:   Log-linear  Regression 

Occasionally when time series data are plotted on an
arithmetic scale the scatter of points falls more in a curve
than in a straight line, with the curve rising or decreasing
more rapidly as X increases. These same data when plotted
on a semilogarithmic scale will produce a straight line.
The relationship between X and Y may then be described by

\[ \log Y = f(X) \]

or

\[ Y = ab^X, \]

the exponential form of the logarithmic relationship between
X and Y.

The  exponential  function  is  used  when  there  is  thought 
to  be  a  constant  rate  of  change  in  Y  per  unit  absolute  change 
in X. Thus, for each year (X), Y changes by a constant percentage
(rather than by an absolute amount as in Method 1).

It is possible to fit the exponential function to the
general linear model by transforming the values of Y to log
Y. Thus

Y = ab^X becomes

log  Y  =  a  +  bX 

log  Y  =  log  a  +  X  log  b 
or 

log Y = log b0 + X log b1 + e.

As in the case of the natural number straight line, the
method of least squares is used to estimate the parameters
necessary for computing the logarithmic (or geometric)
straight line. Tuttle (1957, p. 431) noted, therefore, that
the log $\hat{Y}$'s are fitted to the log Y's, not the $\hat{Y}$'s to the Y's,
by the least squares criterion. Thus, Tuttle (1957, p. 432)
recommended that the standard error of estimate be computed
from the antilogs of the log $\hat{Y}$ values. If $S_{y \cdot x}$ were computed
as the root mean square of the unexplained variation, "it
would be in terms of the deviations of the logarithms of the
$Y_c$'s [$\hat{Y}$'s] from the logarithms of the Y's" (Tuttle, 1957, p.
432). Such an $S_{y \cdot x}$ would not be comparable to the values obtained
from untransformed data as in Method 1.

Similarly, Seidman (1976) has observed that in comparing
linear and log-linear models, $R^2$ may not be a sufficient
criterion of choice. This is because the $R^2$ represents "the
proportion of variance of the logarithm of Y explained by
the regression: log Y, not Y, is the dependent variable"
(Seidman, 1976, p. 463). Therefore, Seidman recommended
using the antilogs of the predicted values of log Y "in a
regression explaining variability in Y" (p. 463). This $R^2$
may then be used for comparison purposes. The examples given
by Seidman (1976) were based on logarithmic transformations of
both dependent and independent variables, but the same
observation may be made when only the dependent variable is
transformed. Seidman's reservation about $R^2$ has been
considered in this study.
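
The recomputation recommended by Tuttle and by Seidman can be sketched as follows. This Python illustration is one reading of those recommendations, not the study's own code: the function names and the data are hypothetical, the common (base 10) logarithm is used, and the comparable $r^2$ is taken to be the squared correlation between the observed Y and the antilogs of the predicted log Y, which equals the $R^2$ of a one-predictor regression of Y on those antilogs.

```python
import math


def fit_line(x, y):
    """Least squares intercept and slope for one predictor."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def log_linear_fit(x, y):
    """Fit log10(Y) on X; return log-scale coefficients and antilog predictions."""
    a, b = fit_line(x, [math.log10(v) for v in y])
    predicted = [10 ** (a + b * xi) for xi in x]  # back in the units of Y
    return a, b, predicted


def squared_correlation(u, v):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    suv = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    suu = sum((ui - u_bar) ** 2 for ui in u)
    svv = sum((vi - v_bar) ** 2 for vi in v)
    return suv ** 2 / (suu * svv)


def s_yx_from_antilogs(y, predicted, k=1):
    """Standard error of estimate in the original units of Y."""
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    return math.sqrt(ss_res / (len(y) - k - 1))


# Hypothetical series growing by roughly 5 percent per year.
x = [0, 1, 2, 3, 4, 5]
y = [100.0, 104.0, 111.5, 115.0, 122.0, 127.5]
a, b, predicted = log_linear_fit(x, y)
print(squared_correlation(y, predicted), s_yx_from_antilogs(y, predicted))
```

Both quantities are expressed in terms of Y rather than log Y, so they can be set beside the corresponding values from Method 1.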

If an exponential curve appears to fit the data, it is
often desirable to find the annual rate of change c. This
can be derived from the regression coefficient $b_1$ of the
log-linear equation, according to the following equations:

\[ b_1 = \log (1 + c) \]
\[ c = \operatorname{antilog}(b_1) - 1. \]

The result should then be expressed as a percent (Mayes &
Mayes, 1976, p. 94; Nie et al., 1975, p. 370).
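
As a brief worked illustration of the conversion, the Python sketch below applies it to the slope of the log-linear equation reported for Indicator 1 in Chapter IV (.01266, in common logarithms). The function name is hypothetical, and the base-10 antilog is assumed because that is the logarithm used for the Y transformation in this study.

```python
def annual_rate_of_change(b1, base=10.0):
    """Convert a log-scale slope to a proportional change per unit of X."""
    return base ** b1 - 1.0


# Slope of the log-linear equation reported for Indicator 1.
rate = annual_rate_of_change(0.01266)
print(f"{100 * rate:.2f} percent per year")
```

The result is a growth rate of a little under 3 percent per year in median family income over the fitted period.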

The  common  or  Briggs  logarithm,  used  in  the  Y  trans- 
formation in  this  study,  is  the  power  to  which  10  must  be 
raised  to  equal  the  number  (see  Neiswanger,  1956,  p.  210; 
Tufte,  1974,  p.  108).  Natural  logs  or  logs  to  the  base  2 
could  also  have  been  used  to  obtain  the  same  results 
(Snedecor,  1956,  pp.  450-451). 

Method  3:   Polynomial  Regression 

In Method 1, the equation which expresses a straight
line relationship between X and Y is

\[ \hat{Y} = b_0 + b_1 X + e, \]

which is a linear (in the b's) first-order (in X) regression
equation. When this functional relationship between X and
Y can be expressed as a solid, or unbroken, curve on arithmetic
paper, the linear, second-order (or quadratic) regression
equation

\[ \hat{Y} = b_0 + b_1 X + b_2 X^2 + e \]

may be appropriate. When the relationship can be expressed
as a curved line with two bends on arithmetic paper, the
linear, third-order (or cubic) regression equation

\[ \hat{Y} = b_0 + b_1 X + b_2 X^2 + b_3 X^3 + e \]

may be used.

According  to  Kerlinger  and  Pedhazur  (1973,  p.  209),  the 
highest  order  a  polynomial  equation  may  take  is  equal  to  N  -  1, 
where  N  is  the  number  of  distinct  values  in  the  independent 
variable.   However,  since  one  of  the  goals  of  scientific 
research  is  parsimony, 


our  interest  is  not  in  the  predictive 
power  of  the  highest  degree  polynomial 
equation  possible,  but  rather  in  the 
highest  degree  polynomial  equation 
necessary  to  describe  a  set  of  data. 
(Kerlinger & Pedhazur, 1973, p. 209)

Another reason for a parsimonious approach to polynomial
curve fitting is that for each order added to the equation,
a degree of freedom is lost. This is especially important
when the number of observations is small, as it is in this
study (observations range from 8 to 20 in each of the eight
sets of data). Also, higher order polynomial curves may
possess statistical significance but be devoid of practical
significance. Accordingly, only the quadratic and cubic
forms of the polynomial regression equation are considered.

In the polynomial regression the independent variable,
X (time), is treated as a categorical variable and is raised
to a certain power. In the quadratic equation, each value
of X is squared to create a new vector of the squared X's,
$X^2$. Similarly, in the cubic equation, each value of X is
cubed to create an additional vector of the cubed X's, $X^3$.
Thus, the resulting equation can be solved by a stepwise
multiple regression procedure, in which at each step of the
analysis, the $R^2$ is tested to see if the higher-degree
polynomial accounts for a significant proportion of the variance.
While a least squares solution is used in this study, the
values of the unknowns may also be found by orthogonal
polynomials (see Draper & Smith, 1966, pp. 150-155; Greenwald,
1963, pp. 204-209; Kerlinger & Pedhazur, 1973, pp. 214-216).

Neiswanger (1956, pp. 529-532) noted that the second-degree
and third-degree parabolas provide greater flexibility
in fitting a line to a set of data, for the parabolas
allow a trend to change direction. Whether or not the
flexibility of the parabolic function enhances the predictability
of extrapolated Y values, however, is not certain
and is examined in this study.
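
The construction of the squared and cubed vectors, followed by a least squares solution, can be sketched in a few lines. The Python below is an illustration only and not the stepwise procedure used in this study: the function names and data are hypothetical, and the coefficients are obtained by solving the normal equations with plain Gaussian elimination. X is centered in the example because raw powers of calendar years tend to be numerically ill-conditioned; that centering is a choice made for the sketch.

```python
def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z


def fit_polynomial(x, y, order):
    """Return [b0, b1, ..., b_order] from the least squares normal equations."""
    powers = range(order + 1)
    a = [[sum(xi ** (i + j) for xi in x) for j in powers] for i in powers]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in powers]
    return solve(a, b)


def r_squared(x, y, coeffs):
    """1 - SS residual / SS total for a fitted polynomial."""
    y_bar = sum(y) / len(y)
    fitted = [sum(c * xi ** p for p, c in enumerate(coeffs)) for xi in x]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical centered time index and indicator values with one bend.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [-0.4, -1.1, -0.6, 1.2, 3.4, 7.1, 11.4]
for order in (1, 2, 3):
    print(order, r_squared(x, y, fit_polynomial(x, y, order)))
```

Because each higher order can only leave $r^2$ unchanged or raise it, the increase must be tested for significance before the added term is retained.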

CHAPTER IV

COMPARISON  OF  EXTRAPOLATIVE  METHODS  USING 
SELECTED  TIME  SERIES  INDICATORS 

In  Chapter  II  a  rationale  for  the  selection  of  social 
variables  operationally  defined  as  time  series  indicators 
was  provided.   The  following  eight  time  series  indicators 
were  selected  for  use  in  the  method  comparison  phase  of  this 
study: 

1.  Median  family  income  in  the  United  States  expressed 
as  1971  constant  dollars. 

2.  Number  of  families  in  the  United  States  headed  by 
women  expressed  as  a  percentage  of  total  families. 

3.  Number  of  wives  in  the  labor  force  expressed  as 
a  percentage  of  total  wives  in  the  United  States. 

4.  Number  of  marriages  in  Florida  expressed  as  rate 
per  1,000  population  in  Florida. 

5.  Number  of  divorces  in  Florida  expressed  as  rate 
per  1,000  population  in  Florida. 

6.  Number  of  resident  live  births  in  Florida  expressed 
as  rate  per  1,000  population  in  Florida. 

7.  Number  of  3  to  5  year  olds  enrolled  in  nursery 
school  and  kindergarten  expressed  as  percentage  of  total 
children  3  to  5  years  old  in  the  United  States. 

8.  Number  of  children  involved  in  divorce  or  annulment 
expressed as rate per 1,000 children under 18 years old in
the  United  States. 

A  rationale  for  the  three  extrapolative  methods  selected 
for  comparison  in  this  study  was  presented  in  Chapter  III. 
The  three  methods  are  simple  linear  regression,  log-linear 
regression,  and  polynomial  regression  (specifically  the 
quadratic  and  cubic  forms). 

The  following  questions  derived  from  the  literature  were 
proposed  in  Chapter  III  to  guide  the  comparison  of  methods: 

1.  Do  the  data  satisfy  the  assumptions  of  the  general 
linear  model? 

2.  How  well  does  the  regression  line  fit  the  data  from 
which  it  was  derived  (the  two-thirds  of  the  data  points  used 
to  generate  the  prediction  equation)? 

3.  How  well  does  the  extrapolated  line  fit  the  data 
(the  one-third  of  the  data  points  that  were  not  used  to 
generate  the  prediction  equation)? 

To  answer  these  questions  in  terms  of  each  method  and 
to  facilitate  comparison  among  the  three  methods,  the  results 
obtained  from  applying  each  of  the  methods  to  each  of  the 
eight  indicator  data  sets  are  presented  in  the  following 
manner: 

1. The fit of the regression line to the observed data
is indicated by $r^2$ and the unbiased standard error of
estimate $S_{y \cdot x}$. For the simple linear and log-linear regression
methods the amount of variance accounted for by the
regression line is tested by the F statistic (the F value is the
same as that obtained by squaring the ratio of $b_1$ to $SE_{b_1}$). For the
quadratic and cubic forms of polynomial regression, both the $r^2$
including all orders entered to that step ($r^2_{y.12}$ or $r^2_{y.123}$)
and the increase in $r^2$ attributable to the last order entered
in the regression ($r^2_{y(2.1)}$ or $r^2_{y(3.12)}$) are tested with the
F statistic.* (Of course, squaring the ratio of the partial regression
coefficient $b_2$ in the quadratic form, or $b_3$ in the cubic
form, to its standard error will also yield the
same F value for the increase in $r^2$.)

2. The fit of the extrapolated line to the data is
indicated numerically by the standard error for the extrapolated
values, $S_{ext(y \cdot x)}$. This measure reflects the
average deviation of the extrapolated values from the observed
values of $Y_i$; thus,

\[ S_{ext(y \cdot x)} = \sqrt{\frac{\sum (Y_i - \hat{Y}_i)^2}{N \text{ (of extrapolated values)}}} \]

(Note that this equation is not the "unbiased" form used in
computing $S_{y \cdot x}$.)

*Actually the increase in $r^2$ is tested according to the
following ratio:

\[ F = \frac{(r^2 \text{ with } k\text{th-order term}) - (r^2 \text{ without } k\text{th-order term})}{(1 - r^2 \text{ with } k\text{th-order term}) / (N - k - 1)} \]

Total $r^2$ is tested according to the following ratio:

\[ F = \frac{SS_{regression} / k}{SS_{residual} / (N - k - 1)} \]

3. All observed and predicted values of Y are reported
in tabular and graphic form.

4.  The  residuals  around  the  regression  line  were  ex- 
amined for  serial  correlation  by  the  Durbin-Watson  d  statistic, 
which  is  noted  only  when  serial  correlation  is  confirmed  or 
questionable. Additionally, the standardized residuals
were plotted against the sequence of cases and also against
standardized  Y  values.   Such  visual  inspection  of  the  data  is 
discussed  as  necessary  to  support  the  interpretation  of  re- 
sults in  Chapter  V. 
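
The two F ratios used to test total $r^2$ and the increase in $r^2$ can be checked directly from reported values. The Python sketch below is an illustration with hypothetical function names. The ratio for total $r^2$ is written in terms of $r^2$ rather than sums of squares, which is algebraically equivalent because $r^2$ is the regression sum of squares divided by the total. The inputs are the $r^2$ values reported for Indicator 1 in Table 2, and N = 17 is inferred from the reported degrees of freedom (1, 15) for the linear equation rather than stated in the text.

```python
def f_total(r2, n, k):
    """F for total r^2: (r^2 / k) / ((1 - r^2) / (N - k - 1))."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def f_change(r2_with, r2_without, n, k):
    """F for the increase in r^2 when the kth-order term is entered."""
    return (r2_with - r2_without) / ((1.0 - r2_with) / (n - k - 1))


N = 17  # inferred from df = 1, 15 for the linear equation (an assumption)
r2_linear, r2_quadratic = 0.97440, 0.97531  # as reported in Table 2

print(f_total(r2_linear, N, 1))
print(f_total(r2_quadratic, N, 2))
print(f_change(r2_quadratic, r2_linear, N, 2))
```

The computed values agree closely with the F values reported in Table 2 for the linear and quadratic equations; small differences are attributable to rounding of the reported $r^2$.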

Presentation  of  Results 

Indicator  1 

The mean and standard deviation for the Y values used to generate the regression equations are 6674 and 987, respectively. The following regression equations were used to derive Ŷ:

Linear   Ŷ = -3933.29 + 192.87 X
Quadratic   Ŷ = 106.37 + 44.79 X + 1.35 X²
Cubic   Ŷ = 135025.33 + (-7385.17)X + 137.08 X² + (-.82)X³
Log-linear   log Ŷ = 3.12346 + .01266 X.

The goodness of fit of the regression lines derived from these equations is indicated by r², r² change, and S_y.x in Table 2. An ANOVA summary table is presented in Table 18 in the Appendix. The overall F's for all methods are significant (p < .01); the increases in r² due to the higher order polynomials are not significant, however.
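To make the use of these equations concrete, the sketch below evaluates the linear and log-linear equations for 1971. Two assumptions are made that the text does not state explicitly but that the tabled predictions bear out: X is the two-digit year (1971 is X = 71), and the log-linear equation uses base-10 logarithms. The function names are illustrative.

```python
def linear_income(x):
    """Linear equation for Indicator 1: Y-hat = -3933.29 + 192.87 X."""
    return -3933.29 + 192.87 * x

def log_linear_income(x):
    """Log-linear equation for Indicator 1, converted back from log10 units."""
    return 10 ** (3.12346 + 0.01266 * x)

x_1971 = 71  # assumed coding: two-digit year
print(round(linear_income(x_1971)))      # 9760, as tabled for 1971
print(round(log_linear_income(x_1971)))  # 10527, as tabled for 1971
```

The polynomial equations are omitted here because their coefficients are printed with too few digits to reproduce the tabled predictions exactly.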
Table 2
Indicator 1: Summary Statistics for
Prediction Equations by Method

                 r²        r² change   F           df      S_y.x

Linear           .97440                571.04**    1,15    163.02

Quadratic^a      .97531                276.47**    2,14    165.75
                           .00090         .51      1,14

Cubic^a          .98137                228.25**    3,13    149.40
                           .00606         .42      1,13

Log-linear^b
  log Y          .97021                488.54**    1,15    log .01157
  antilog Y      .98726                                    164.92

Note. Indicator 1 is median family income expressed in 1971 constant dollars.

^a Both quadratic and cubic forms of the polynomial regression are presented.

^b Both r² and S_y.x have been recomputed using antilogs of the log Ŷ; much of the difference between r² (log Y) and r² (antilog Y) may be due to rounding.

**p < .01
The average errors for the extrapolated Y values (S_ext(y.x)) according to method employed are (a) linear, 669; (b) quadratic, 487; (c) cubic, 1981; and (d) log-linear, 283. Observed Y's and predicted values of Y for both the original regression and the extrapolated lines are presented in Table 3. These data are graphically presented in Figure 2.

Thus, there is very little difference in the total r² for the methods; the quadratic and cubic forms added little to the r² already provided by the linear component. The S_y.x for the cubic form is smaller (149.40) than for the other methods.

When the lines are extrapolated beyond the original values, however, the cubic form is clearly the "worst" fit with a S_ext(y.x) of 1981 and the log-linear method the "best" with a S_ext(y.x) of 283. Whether the exponential curve would continue to be a superior predictor is a matter of conjecture.


Indicator  2 

The mean and standard deviation for the Y values used to generate the regression equations are 10.4 and .77, respectively. The following regression equations were used to derive Ŷ:

Linear   Ŷ = 9.87860 + .02789 X
Quadratic   Ŷ = 11.08794 + (-.19035)X + .00626 X²
Cubic   Ŷ = 11.45737 + (-.35750)X + .01954 X² + (-.00027)X³
Log-linear   log Ŷ = .99387 + .00118 X.

The goodness of fit of the regression lines derived from these equations is indicated by r², r² change, and S_y.x in
Table 3
Indicator 1: Observed Y's and Predicted Y's by Method

                         Predicted Y's by Method
Year   Observed Y's   Linear     Quadratic^a   Cubic^a   Log-linear

Original regression^b
1947    5,483          5,131^c    5,185         5,323     5,231
1948    5,367          5,324      5,358         5,392     5,386
1949    5,278          5,517      5,533         5,499     5,545
1950    5,594          5,710      5,711         5,637     5,709
1951    5,783          5,903      5,892         5,803     5,878
1952    5,939          6,096      6,076         5,992     6,052
1953    6,433          6,289      6,262         6,197     6,231
1954    6,288          6,481      6,450         6,416     6,415
1955    6,693          6,674      6,642         6,642     6,605
1956    7,122          6,867      6,836         6,871     6,800
1957    7,138          7,060      7,033         7,097     7,002
1958    7,126          7,253      7,233         7,317     7,209
1959    7,524          7,446      7,435         7,524     7,422
1960    7,688          7,639      7,640         7,714     7,642
1961    7,765          7,831      7,848         7,882     7,868
1962    7,975          8,024      8,058         8,023     8,100
1963    8,267          8,217      8,271         8,133     8,340

Extrapolation^d
1964    8,579          8,410      8,487         8,205     8,584
1965    8,932          8,603      8,705         8,235     8,838
1966    9,360          8,796      8,926         8,220     9,100
1967    9,683          8,989      9,150         8,152     9,369
1968   10,049          9,182      9,377         8,028     9,646
1969   10,423          9,374      9,606         7,843     9,931
1970   10,289          9,567      9,838         7,591    10,225
1971   10,285          9,760     10,072         7,268    10,527

Note. Indicator 1 is median family income expressed in 1971 constant dollars.

^a Both quadratic and cubic forms of polynomial regression are presented.

^b The regression line is derived from 2/3 of the known data points.

^c Predicted Y's in terms of 1971 constant dollars are rounded to number of places in original data.

^d Values are extrapolated beyond the data points used to generate the regression equation.
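The split between fitting years and extrapolated years can be illustrated by refitting the linear and log-linear equations from the 17 original-regression rows of Table 3. This is a sketch under stated assumptions, not the study's own computation: X is coded as the two-digit year, logs are base 10, and `fit_line` is an illustrative ordinary least-squares helper.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Original-regression years 1947-1963 (X = 47..63) and observed Y's from Table 3
years = list(range(47, 64))
income = [5483, 5367, 5278, 5594, 5783, 5939, 6433, 6288, 6693,
          7122, 7138, 7126, 7524, 7688, 7765, 7975, 8267]

a_lin, b_lin = fit_line(years, income)
a_log, b_log = fit_line(years, [math.log10(y) for y in income])

print(round(a_lin, 2), round(b_lin, 2))  # about -3933.29 and 192.87
print(round(a_log, 5), round(b_log, 5))  # about 3.12346 and .01266
```

The recovered coefficients match the linear and log-linear equations given for Indicator 1, which supports the assumed coding of X.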
Figure 2. Indicator 1: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation).
Table 4. An ANOVA summary table is presented in Table 19 in the Appendix. The overall F's for the quadratic and cubic forms of the polynomial regression are significant (p < .05); however, only the increase in r² due to the quadratic is significant (p < .01).

The average errors for the extrapolated Y values (S_ext(y.x)) according to the method employed are (a) linear, 1.5; (b) quadratic, .35; (c) cubic, 1.1; and (d) log-linear, 1.5. Observed Y's and predicted values of Y for both the original regression and the extrapolated lines are presented by method in Table 5. These data are graphically presented in Figure 3.

Thus, the cubic form of the polynomial accounts for the most variance in Y (89%) and has the smallest S_y.x (.33). The quadratic form, accounting for 81% of the variance in Y, has a S_y.x of .39; visual inspection of the second-degree curve reveals that this curve may, in fact, more closely fit observed values for the latter portion of the regression line than the cubic form. The linear and log-linear methods provide no better estimate of Y than does the mean of Y; indeed, the standard error of estimate approximates the standard deviation of the observed Y's.
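The closeness of the standard error of estimate to the standard deviation follows from how the two are related. The sketch below is a standard identity rather than anything stated in the study, and it assumes the reported standard deviation uses an N - 1 denominator and that S_y.x is the "unbiased" form with N - k - 1; the function name is illustrative.

```python
import math

def standard_error_of_estimate(sd_y, r2, n, k):
    """S_y.x from the SD of Y and r2.

    SS residual = (1 - r2) * SS total, and SS total = (N - 1) * sd_y**2,
    so S_y.x = sqrt(SS residual / (N - k - 1)).
    """
    return sd_y * math.sqrt((1.0 - r2) * (n - 1) / (n - k - 1))

# Indicator 2, linear method: SD = .77, r2 = .16405, N = 8 (from df = 1,6), k = 1
s_linear = standard_error_of_estimate(0.77, 0.16405, n=8, k=1)
print(round(s_linear, 2))  # 0.76, against an SD of .77
```

With r² this low, almost none of the variance is removed, and the lost degree of freedom nearly offsets what little is, so S_y.x stays close to the standard deviation.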

When the lines are extended beyond the original values, the quadratic provides the superior fit (S_ext(y.x) = .35); the fit of the cubic, linear, and log-linear methods to the observed values is poor with residuals becoming larger for successive years.
Table 4
Indicator 2: Summary Statistics for
Prediction Equations by Method

                 r²        r² change   F           df      S_y.x

Linear           .16405                  1.18      1,6     .76

Quadratic^a      .81417                 10.95*     2,5     .39
                           .65012       17.49**    1,5

Cubic^a          .89255                 11.08*     3,4     .33
                           .07838        2.92      1,4

Log-linear^b
  log Y          .16905                  1.22      1,6     log .03175
  antilog Y      .16307                                    .78

Note. Indicator 2 is number of families in the United States headed by women expressed as a percentage of total families.

^a Both quadratic and cubic forms of the polynomial regression are presented.

^b Both r² and S_y.x have been recomputed using antilogs of the log Ŷ; much of the difference between r² (log Y) and r² (antilog Y) may be due to rounding.

*p < .05
**p < .01
Table 5
Indicator 2: Observed Y's and Predicted Y's by Method

                           Predicted Y's by Method
Year(X)     Observed Y's   Linear    Quadratic^a   Cubic^a   Log-linear

Original regression^b
1940(1)     11.2            9.9^c    10.9          11.1       9.9
1947(8)      9.5           10.1      10.0           9.7      10.1
1950(11)     9.4           10.2       9.7           9.5      10.2
1955(16)    10.1           10.3       9.6           9.6      10.3
1960(21)    10.0           10.5       9.9          10.1      10.4
1965(26)    10.5           10.6      10.4          10.6      10.6
1970(31)    10.9           10.7      11.2          11.1      10.7
1971(32)    11.5           10.8      11.4          11.2      10.8

Extrapolation^d
1972(33)    11.6           10.8      11.6          11.2      10.8
1973(34)    12.1           10.8      11.9          11.3      10.8
1974(35)    12.4           10.9      12.1          11.3      10.8
1975(36)    13.0           10.9      12.4          11.3      10.9

Note. Indicator 2 is number of families in the United States headed by women expressed as a percentage of total families.

^a Both quadratic and cubic forms of polynomial regression are presented.

^b The regression line is derived from 2/3 of the known data points.

^c Predicted Y's are rounded to number of places in original data.

^d Values are extrapolated beyond the data points used to generate the regression equation.
Indicator 3

The mean and standard deviation for the Y values used to generate the regression equations are 32.0 and 3.8, respectively. The following regression equations were used to derive Ŷ:

Linear   Ŷ = 23.27325 + .74603 X
Quadratic   Ŷ = 23.38774 + .71893 X + .00126 X²
Cubic   Ŷ = 22.60710 + 1.18842 X + (-.05693)X² + .00194 X³
Log-linear   log Ŷ = 1.37971 + .01047 X.

The goodness of fit of the regression lines derived from these equations is indicated by r², r² change, and S_y.x in Table 6. An ANOVA summary table is presented in Table 20 in the Appendix. The overall F's for all methods are significant (p < .01); however, the increases in r² due to the higher order polynomials are not significant.

The average errors for the extrapolated Y values (S_ext(y.x)) according to the method employed are (a) linear, 1.4; (b) quadratic, 1.2; (c) cubic, 2.7; and (d) log-linear, .6. Observed Y's and predicted values of Y for both the original regression and the extrapolated lines are presented by method in Table 7. These data are graphically represented in Figure 4.

Thus, there is very little difference in the total r² for the methods; the quadratic and cubic forms added an insignificant amount to the r² already provided by the linear component. The S_y.x for the cubic form is only slightly better (.43) than for the other methods (.49-.54).
Table 6
Indicator 3: Summary Statistics for
Prediction Equations by Method

                 r²        r² change   F           df      S_y.x

Linear           .98451                826.05**    1,13    .49

Quadratic^a      .98460                383.49**    2,12    .50
                           .00009         .07      1,12

Cubic^a          .98953                346.51**    3,11    .43
                           .00493        5.18      1,11

Log-linear^b
  log Y          .98085                665.69**    1,13    log .00760
  antilog Y      .99999                                    .54

Note. Indicator 3 is the number of wives in the labor force expressed as a percentage of total wives in the United States.

^a Both quadratic and cubic forms of the polynomial regression are presented.

^b Both r² and S_y.x have been recomputed using antilogs of the log Ŷ; much of the difference between r² (log Y) and r² (antilog Y) may be due to rounding.

**p < .01
Table 7
Indicator 3: Observed Y's and Predicted Y's by Method

                           Predicted Y's by Method
Year(X)     Observed Y's   Linear    Quadratic^a   Cubic^a   Log-linear

Original regression^b
1950(1)     23.8           24.0^c    24.1          23.7      24.6
1955(6)     27.7           27.7      27.8          28.1      27.7
1956(7)     29.0           28.5      28.5          28.8      28.4
1957(8)     29.6           29.2      29.2          29.5      29.1
1958(9)     30.2           30.0      30.0          30.1      29.8
1959(10)    30.9           30.7      30.7          30.7      30.5
1960(11)    30.5           31.5      31.4          31.4      31.3
1961(12)    32.7           32.2      32.2          32.0      32.0
1962(13)    32.7           33.0      33.0          32.7      32.8
1963(14)    33.7           33.7      33.7          33.4      33.6
1964(15)    34.4           34.5      34.5          34.2      34.4
1965(16)    34.7           35.2      35.2          35.0      35.3
1966(17)    35.4           36.0      36.0          35.9      36.1
1967(18)    36.8           36.7      36.7          36.9      37.0
1968(19)    38.3           37.4      37.5          38.0      37.9

Extrapolation^d
1969(20)    39.6           38.2      38.3          39.1      38.8
1970(21)    40.8           38.9      39.0          40.4      39.8
1971(22)    40.8           39.7      39.8          41.9      40.7
1972(23)    41.5           40.4      40.6          43.4      41.7
1973(24)    42.2           41.2      41.4          45.2      42.8
1974(25)    43.0           41.9      42.1          47.0      43.8
1975(26)    44.4           42.7      42.9          49.1      44.9

Note. Indicator 3 is the number of wives in the labor force expressed as a percentage of total wives in the United States.

^a Both quadratic and cubic forms of polynomial regression are presented.

^b The regression line is derived from 2/3 of the known data points.

^c Predicted Y's are rounded to number of places in original data.

^d Values are extrapolated beyond the data points used to generate the regression equation.
[Plot not reproduced; legend: Observed Y's, Cubic, Linear, Quadratic, Log-linear.]

Figure 4. Indicator 3: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation).
When the lines are extrapolated beyond the original values, however, the cubic form is clearly the "worst" fit with a S_ext(y.x) of 2.7 while the log-linear form is clearly the "best" fit with a S_ext(y.x) of .6.

Indicator  4 

The mean and standard deviation for the Y values used to generate the regression equations are 9.8 and 2.6, respectively. The following regression equations were used to derive Ŷ:

Linear   Ŷ = 13.76707 + (-.13216)X
Quadratic   Ŷ = 14.16426 + (-.19720)X + .00148 X²
Cubic   Ŷ = 10.90554 + 1.15549 X + (-.07961)X² + .00125 X³
Log-linear   log Ŷ = 1.12873 + (-.00496)X.

The goodness of fit of the regression lines derived from these equations is indicated by r², r² change, and S_y.x in Table 8. An ANOVA summary table is presented in Table 21 in the Appendix. The overall F statistic for the cubic form of the polynomial regression is significant (p < .01); the increase in r² due to the third degree polynomial is also significant (p < .01). The F statistic for both the linear and log-linear methods is significant (p < .05).

The average errors for the extrapolated Y values (S_ext(y.x)) according to method employed are (a) linear, 2.9; (b) quadratic, 2.5; (c) cubic, 4.1; and (d) log-linear, 2.7. Observed Y's and predicted values of Y for both the original regression and the extrapolated lines are presented by method in Table 9. These data are graphically represented in Figure 5.
Table 8
Indicator 4: Summary Statistics for
Prediction Equations by Method

                 r²                r² change   F          df      S_y.x

Linear           .42085                         7.27*     1,10    2.07

Quadratic^a      .42592                         3.34      2,9     2.17
                                   .00507        .08      1,9

Cubic^a          .90338                        24.93**    3,8     .94
                                   .47746      39.53**    1,8

Log-linear^b
  log Y          .42298                         7.33*     1,10    log .07715
  antilog Y      .36243/.42922^c                                  2.06

Note. Indicator 4 is number of marriages in Florida expressed as rate per 1,000 population in Florida.

^a Both quadratic and cubic forms of the polynomial regression are presented.

^b Both r² and S_y.x have been recomputed using antilogs of the log Ŷ; much of the difference between r² (log Y) and r² (antilog Y) may be due to rounding.

^c Two methods of computing r² using antilogs of Ŷ yielded different results.

*p < .05
**p < .01
Table 9
Indicator 4: Observed Y's and Predicted Y's by Method

                           Predicted Y's by Method
Year(X)     Observed Y's   Linear    Quadratic^a   Cubic^a   Log-linear

Original regression^b
1930(1)     11.6           13.6^c    14.0          12.0      13.3
1940(11)    17.1           12.3      12.2          15.7      11.9
1950(21)     9.8           11.0      10.7          11.7      10.6
1960(31)     7.9            9.7       9.5           7.6       9.4
1963(34)     7.7            9.3       9.2           7.5       9.1
1964(35)     7.7            9.1       9.1           7.6       9.0
1965(36)     8.3            9.0       9.0           7.9       8.9
1966(37)     8.5            8.9       8.9           8.2       8.8
1967(38)     9.0            8.7       8.8           8.7       8.7
1968(39)     9.6            8.6       8.7           9.3       8.6
1969(40)     9.8            8.5       8.6          10.0       8.5
1970(41)    10.1            8.3       8.6          10.9       8.4

Extrapolation^d
1971(42)    10.5            8.2       8.5          11.6       8.3
1972(43)    11.0            8.1       8.4          12.8       8.2
1973(44)    11.4            8.0       8.4          14.1       8.1
1974(45)    11.0            7.8       8.3          15.6       8.0
1975(46)    10.1            7.7       8.2          17.3       8.0

Note. Indicator 4 is number of marriages in Florida expressed as rate per 1,000 population.

^a Both quadratic and cubic forms of polynomial regression are presented.

^b The regression line is derived from 2/3 of the known data points.

^c Predicted Y's are rounded to number of places in original data.

^d Values are extrapolated beyond the data points used to generate the regression equation.
[Plot not reproduced; legend: Observed Y's, Cubic, Linear, Quadratic, Log-linear.]

Figure 5. Indicator 4: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation).
Thus, it would appear that for the original regression the cubic form of the polynomial best fits the observed Y's. This method accounts for 90% of the variance in Y with less than half of the average error of the other methods.

When the lines are extrapolated beyond the original values, however, the cubic form has the largest average error (S_ext(y.x) = 4.1). The quadratic form of the polynomial is, in fact, the best predictor (S_ext(y.x) = 2.5) of the methods compared. Actually, for this set of data the mean (9.8) of the observed values of Y used in the original regression would have been the best predictor of the future values of Y.
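The claim about the mean can be checked directly with the extrapolation rows of Table 9. The sketch below treats the mean of the fitting years (9.8) as a constant forecast and compares its average error with that of the quadratic form; `average_error` is an illustrative name, and the calculation divides by N as in the S_ext(y.x) definition.

```python
import math

def average_error(observed, predicted):
    """S_ext(y.x): root of the mean squared deviation over the extrapolated years."""
    total = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    return math.sqrt(total / len(observed))

# Indicator 4, extrapolated years 1971-1975 (values from Table 9)
observed_1971_75 = [10.5, 11.0, 11.4, 11.0, 10.1]
quadratic_1971_75 = [8.5, 8.4, 8.4, 8.3, 8.2]
mean_forecast = [9.8] * len(observed_1971_75)  # mean of the original-regression Y's

print(round(average_error(observed_1971_75, quadratic_1971_75), 1))  # 2.5
print(round(average_error(observed_1971_75, mean_forecast), 1))      # 1.1
```

On these tabled values the constant forecast has less than half the average error of the best extrapolative method.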

Indicator  5 

The mean and standard deviation for the Y values used to generate the regression equations are 4.6 and 1.0, respectively. The following regression equations were used to derive Ŷ:

Linear   Ŷ = 4.10332 + .01692 X
Quadratic   Ŷ = 3.16091 + .17126 X + (-.00350)X²
Cubic   Ŷ = 1.59416 + .82162 X + (-.04249)X² + .00060 X³
Log-linear   log Ŷ = .56596 + .00288 X.

The goodness of fit of the regression lines derived from these equations is indicated by r², r² change, and S_y.x in Table 10. An ANOVA summary table is presented in Table 22 in the Appendix. Only the overall F for the cubic form of the polynomial regression is significant (p < .05); the F
Table 10
Indicator 5: Summary Statistics for
Prediction Equations by Method

                 r²        r² change   F           df      S_y.x

Linear           .04360                   .46      1,10    1.06

Quadratic^a      .22401                  1.30      2,9     1.00
                           .18041        2.09      1,9

Cubic^a          .92134                 31.23**    3,8     .34
                           .69733       70.92**    1,8

Log-linear^b
  log Y          .12007                  1.36      1,10    log .10388
  antilog Y      .12779                                    1.07

Note. Indicator 5 is number of dissolutions of marriage in Florida expressed as rate per 1,000 population.

^a Both quadratic and cubic forms of the polynomial regression are presented.

^b Both r² and S_y.x have been recomputed using antilogs of the log Ŷ; much of the difference between r² (log Y) and r² (antilog Y) may be due to rounding.

**p < .01
value for the increase in r² for the cubic component is also significant (p < .01).

The average errors for the extrapolated Y values (S_ext(y.x)) according to the method employed are (a) linear, 2.2; (b) quadratic, 3.1; (c) cubic, .5; and (d) log-linear, 2.1. Observed Y's and predicted values of Y for both the original regression and the extrapolated lines are presented by method in Table 11. These data are graphically represented in Figure 6.

For this set of data, the cubic form of the polynomial regression is a superior predictor of the observed Y's. This method accounts for 92% of the variance in Y with a S_y.x of .34; the S_ext(y.x) is .5, considerably less than the other three methods with average error ranging from 2.1 to 3.1. The quadratic form is definitely the least appropriate method for this set of data since the curve bends in an opposite direction to the observed Y values (see Figure 6).
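Why the quadratic bends the wrong way can be seen from its turning point. The sketch below uses the quadratic equation given for Indicator 5 and assumes the X coding shown in Table 11, where 1930 is X = 1; the function name is illustrative.

```python
def quadratic_dissolution_rate(x):
    """Quadratic equation for Indicator 5: 3.16091 + .17126 X - .00350 X^2."""
    return 3.16091 + 0.17126 * x + (-0.00350) * x ** 2

# A parabola a + b*X + c*X^2 turns at X = -b / (2c).
turning_point_x = -0.17126 / (2 * -0.00350)

print(round(turning_point_x, 1))                   # 24.5, i.e. about 1953-54
print(round(quadratic_dissolution_rate(46), 1))    # 3.6, the tabled 1975 prediction
```

Because the squared term is negative, the fitted curve peaks in the early 1950s and declines through every extrapolated year, while the observed rates rise from 6.1 to 7.5 over 1971-1975.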

Indicator  6 

The mean for the Y values used to generate the regression equations is 16.9; the standard deviation, 1.3. The following regression equations were used to derive Ŷ:

Linear   Ŷ = 18.56785 + (-.36786)X
Quadratic   Ŷ = 21.30892 + (-2.01250)X + (.18274)X²
Cubic   Ŷ = 22.82142 + (-3.58611)X + .59524 X² + (-.03056)X³
Log-linear   log Ŷ = 1.26761 + (-.00900)X.

The goodness of fit of the regression lines derived from
Table 11
Indicator 5: Observed Y's and Predicted Y's by Method

                           Predicted Y's by Method
Year(X)     Observed Y's   Linear    Quadratic^a   Cubic^a   Log-linear

Original regression^b
1930(1)     2.5            4.1^c     3.3           2.4       3.7
1940(11)    5.8            4.3       4.6           6.3       4.0
1950(21)    6.4            4.5       5.2           5.7       4.2
1960(31)    3.9            4.6       5.1           4.2       4.5
1963(34)    4.1            4.7       4.9           4.1       4.6
1964(35)    4.1            4.7       4.9           4.2       4.6
1965(36)    4.2            4.7       4.8           4.3       4.7
1966(37)    4.2            4.7       4.7           4.4       4.7
1967(38)    4.6            4.7       4.6           4.6       4.7
1968(39)    4.9            4.8       4.5           4.8       4.8
1969(40)    5.2            4.8       4.4           5.1       4.8
1970(41)    5.5            4.8       4.2           5.4       4.8

Extrapolation^d
1971(42)    6.1            4.8       4.2           5.6       4.9
1972(43)    6.9            4.8       4.1           6.1       4.9
1973(44)    7.1            4.8       3.9           6.6       4.9
1974(45)    7.2            4.9       3.8           7.2       5.0
1975(46)    7.5            4.9       3.6           7.9       5.0

Note. Indicator 5 is number of dissolutions of marriage in Florida expressed as rate per 1,000 population.

^a Both quadratic and cubic forms of polynomial regression are presented.

^b The regression line is derived from 2/3 of the known data points.

^c Predicted Y's are rounded to number of places in original data.

^d Values are extrapolated beyond the data points used to generate the regression equation.
these equations is indicated by r², r² change, and S_y.x in Table 12. An ANOVA summary table is presented in Table 23 in the Appendix. The overall F's for the quadratic and cubic forms of polynomial regression are significant (p < .01); only the increase in r² due to the quadratic component is significant (p < .01), however.

The average errors for the extrapolated Y values (S_ext(y.x)) according to the method employed are (a) linear, 1.2; (b) quadratic, 7.8; (c) cubic, 1.5; and (d) log-linear, 1.4. Observed Y's and predicted values of Y for both the original regression and the extrapolated lines are presented by method in Table 13. These data are graphically represented in Figure 7.

Thus, while the quadratic and cubic forms of polynomial regression best fit the observed Y's for the original regression, they do not continue to be superior predictors. In fact, the quadratic form has a S_ext(y.x) of 7.8 while the S_ext(y.x) for the other three methods ranges from 1.2 to 1.5. No method is clearly the best predictor of Y when values are extrapolated beyond the original regression.

It should be noted that the Durbin-Watson d for the linear and log-linear methods approaches the lower limits of d and the possibility of serial correlation of the residuals cannot be overlooked. Because of the small number of observations involved in this data set (N = 8), interpretation of the Durbin-Watson d is more suggestive than conclusive.
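For readers unfamiliar with the statistic, the sketch below computes the Durbin-Watson d from the linear residuals for Indicator 6, using the rounded observed and predicted values in Table 13. The study does not report the d values themselves, so the result is only an approximation from rounded figures and is not the author's computed value; the function name is illustrative.

```python
def durbin_watson(residuals):
    """d = sum of squared successive differences / sum of squared residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Indicator 6, original regression 1964-1971 (rounded values from Table 13)
observed = [19.7, 17.9, 16.8, 15.9, 15.7, 16.1, 16.8, 16.4]
linear_fit = [18.2, 17.8, 17.5, 17.1, 16.7, 16.4, 16.0, 15.6]
residuals = [y - y_hat for y, y_hat in zip(observed, linear_fit)]

d_linear = durbin_watson(residuals)
print(round(d_linear, 2))  # roughly 0.70 from these rounded values
```

Values of d near 2 indicate no first-order serial correlation, and values well below 2 point toward positive serial correlation, which is the direction of the concern raised for the linear method.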
Table 12
Indicator 6: Summary Statistics for
Prediction Equations by Method

                 r²        r² change   F           df      S_y.x

Linear           .46628                  5.24      1,6     1.04

Quadratic^a      .92655                 31.54**    2,5     .42
                           .46026       31.33**    1,5

Cubic^a          .97205                 46.37**    3,4     .29
                           .04550        6.51      1,4

Log-linear^b
  log Y          .45957                  5.10      1,6     log .02582
  antilog Y      .41427                                    1.03

Note. Indicator 6 is number of resident live births in Florida expressed as rate per 1,000 population.

^a Both quadratic and cubic forms of the polynomial regression are presented.

^b Both r² and S_y.x have been recomputed using antilogs of the log Ŷ; much of the difference between r² (log Y) and r² (antilog Y) may be due to rounding.

**p < .01




Table 13
Indicator 6: Observed Y's and Predicted Y's by Method

                           Predicted Y's by Method
Year(X)     Observed Y's   Linear    Quadratic^a   Cubic^a   Log-linear

Original regression^b
1964(1)     19.7           18.2^c    19.5          19.8      18.1
1965(2)     17.9           17.8      18.0          17.8      17.8
1966(3)     16.8           17.5      16.9          16.6      17.4
1967(4)     15.9           17.1      16.2          16.0      17.0
1968(5)     15.7           16.7      15.8          16.0      16.7
1969(6)     16.1           16.4      15.8          16.1      16.4
1970(7)     16.8           16.0      16.1          16.4      16.0
1971(8)     16.4           15.6      16.9          16.6      15.7

Extrapolation^d
1972(9)     14.8           15.3      18.0          16.5      15.4
1973(10)    13.7           14.9      19.5          15.9      15.1
1974(11)    13.4           14.5      22.1          14.7      14.7
1975(12)    12.5           14.1      23.5          12.7      14.4

Note. Indicator 6 is number of resident live births in Florida expressed as rate per 1,000 population.

^a Both quadratic and cubic forms of polynomial regression are presented.

^b The regression line is derived from 2/3 of the known data points.

^c Predicted Y's are rounded to number of places in original data.

^d Values are extrapolated beyond the data points used to generate the regression equation.
[Plot not reproduced; horizontal axis: years 1964 through 1975; legend: Observed Y's, Cubic, Linear, Quadratic, Log-linear.]

Figure 7. Indicator 6: Observed Y's and predicted Y's by method (vertical line separates values of original regression from extrapolation).
Indicator 7

The mean and standard deviation for the Y values used to generate the regression equations are 32.2 and 4.8, respectively. The following regression equations were used to derive Ŷ:

Linear   Ŷ = 23.42857 + 1.95476 X
Quadratic   Ŷ = 23.58928 + 1.85833 X + .01071 X²
Cubic   Ŷ = 23.26430 + 2.19645 X + (-.07792)X² + .00657 X³
Log-linear   log Ŷ = 1.38414 + .02662 X.

The goodness of fit of the regression lines derived from these equations is indicated by r², r² change, and S_y.x in Table 14. An ANOVA summary table is presented in Table 24 in the Appendix. The overall F's for all methods are significant (p < .01); however, the increases in r² due to the higher order polynomials are not significant.

The average errors for the extrapolated Y values (S_ext(y.x)) according to the method employed are (a) linear, 1.4; (b) quadratic, 1.3; (c) cubic, 1.8; and (d) log-linear, 2.4. Observed and predicted values of Y for both the original regression and the extrapolated lines are presented by method in Table 15. These data are graphically represented in Figure 8.

There is very little difference in the predictive value of the methods for the original regression. Each method accounts for 99% of the variance in Y; the range of the S_y.x for all methods is from .34 to .43.

When the lines are extrapolated beyond the original
Table 14
Indicator 7: Summary Statistics for
Prediction Equations by Method

                 r²        r² change   F            df      S_y.x

Linear           .99560                1358.03**    1,6     .34

Quadratic^a      .99572                 581.74**    2,5     .37
                           .00012          .14      1,5

Cubic^a          .99588                 322.27**    3,4     .41
                           .00016          .15      1,4

Log-linear^b
  log Y          .99333                 893.91**    1,6     log .00577
  antilog Y      .99999                                     .43

Note. Indicator 7 is number of 3 to 5 year olds enrolled in nursery school and kindergarten expressed as percentage of total children 3 to 5 years old in the United States.

^a Both quadratic and cubic forms of the polynomial regression are presented.

^b Both r² and S_y.x have been recomputed using antilogs of the log Ŷ; much of the difference between r² (log Y) and r² (antilog Y) may be due to rounding.

**p < .01
Table 15
Indicator 7: Observed Y's and Predicted Y's by Method

                           Predicted Y's by Method
Year(X)     Observed Y's   Linear    Quadratic^a   Cubic^a   Log-linear

Original regression^b
1964(1)     25.5           25.4^c    25.5          25.4      25.7
1965(2)     27.1           27.3      27.3          27.4      27.4
1966(3)     29.4           29.3      29.3          29.3      29.1
1967(4)     31.6           31.2      31.2          31.2      30.9
1968(5)     33.0           33.2      33.1          33.1      32.9
1969(6)     34.6           35.2      35.1          35.1      35.0
1970(7)     37.5           37.1      37.1          37.1      37.2
1971(8)     39.1           39.1      39.1          39.2      39.5

Extrapolation^d
1972(9)     41.6           41.0      41.2          41.5      42.0
1973(10)    40.9           43.0      43.2          44.0      44.7
1974(11)    45.2           45.0      45.3          46.7      47.5
1975(12)    48.7           46.9      47.4          49.8      50.5

Note. Indicator 7 is number of 3 to 5 year olds enrolled in nursery school and kindergarten expressed as percentage of total children 3 to 5 years old in the United States.

^a Both quadratic and cubic forms of polynomial regression are presented.

^b The regression line is derived from 2/3 of the known data points.

^c Predicted Y's are rounded to number of places in original data.

^d Values are extrapolated beyond the data points used to generate the regression equation.
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values,  the  quadratic  and  linear  methods  are  somewhat  better 
predictors with S_ext(y.x) of 1.3 and 1.4, respectively.
Should  the  lines  be  extrapolated  further  into  the  future, 
however,  the  cubic  and  log-linear  methods  might  improve  in 
predictive  accuracy  (see  Figure  8) . 

Indicator  8 

The  mean  and  standard  deviation  for  the  Y  values  used 
to generate the regression equations are 7.6 and 1.2, respectively.
The following regression equations were used
to  derive  Y: 

Linear       Y = 5.48381 + .26286 X
Quadratic    Y = 6.22593 + .00093 X + .01637 X^2
Cubic        Y = 6.52952 - .19576 X + .04613 X^2 - .00124 X^3
Log-linear   log Y = .75640 + .01481 X.
The goodness of fit of the regression lines derived from
these equations is indicated by r², r² change, and S_y.x in
Table 16. An ANOVA summary table is presented in Table 25
in the Appendix. The overall F's for all methods are significant
(p<.01); however, only the increase in r² due to the
quadratic component is significant (p<.01).
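
As a hedged illustration, the four Indicator 8 prediction equations can be evaluated directly. The sketch below copies the reported coefficients and assumes that X = 1 corresponds to 1953 and that "log" is the common (base-10) logarithm; the function and variable names are hypothetical.

```python
# Coefficients copied from the Indicator 8 prediction equations (X = 1 is 1953).
LINEAR = (5.48381, 0.26286)
QUADRATIC = (6.22593, 0.00093, 0.01637)
CUBIC = (6.52952, -0.19576, 0.04613, -0.00124)
LOG_LINEAR = (0.75640, 0.01481)  # predicts log10(Y); base 10 is an assumption


def poly(coefs, x):
    """Evaluate b0 + b1*x + b2*x**2 + ... for one value of x."""
    return sum(b * x ** k for k, b in enumerate(coefs))


def predict_all(x):
    """Return the four predicted Y values for time index x, rounded to one decimal."""
    return {
        "linear": round(poly(LINEAR, x), 1),
        "quadratic": round(poly(QUADRATIC, x), 1),
        "cubic": round(poly(CUBIC, x), 1),
        # The log-linear equation predicts log Y, so take the antilog.
        "log-linear": round(10 ** poly(LOG_LINEAR, x), 1),
    }


for x in (1, 15, 18, 23):
    print(1952 + x, predict_all(x))
```

Under these assumptions the rounded predictions for X = 1 and X = 23 agree with the first and last rows of Table 17, which supports the base-10 reading of the log-linear equation.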

The average errors for the extrapolated Y values
[S_ext(y.x)] according to the method employed are (a) linear,
4.3; (b) quadratic, 1.9; (c) cubic, 4.2; and (d) log-linear,
3.7. Observed and predicted values of Y for both the
original regression and the extrapolated lines are presented
by method in Table 17. These data are graphically represented
in Figure 9.
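
The average extrapolation errors quoted above can be recovered from the tabulated values. The sketch below is not taken from the study: it assumes that S_ext(y.x) is the square root of the mean squared difference between observed and extrapolated Y's, and it uses the rounded 1970-1975 values from Table 17. The helper name is hypothetical.

```python
import math

# Observed and predicted Y values for the extrapolation years 1970-1975 (Table 17).
observed = [12.5, 13.6, 14.8, 15.9, 16.4, 17.1]
predicted = {
    "linear":     [10.2, 10.5, 10.7, 11.0, 11.3, 11.5],
    "quadratic":  [11.5, 12.1, 12.8, 13.5, 14.2, 14.9],
    "cubic":      [10.7, 11.0, 11.1, 11.3, 11.3, 11.3],
    "log-linear": [10.5, 10.9, 11.3, 11.7, 12.1, 12.5],
}


def extrapolation_error(obs, pred):
    """Root mean squared difference between observed and extrapolated values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


s_ext = {m: round(extrapolation_error(observed, p), 1) for m, p in predicted.items()}
print(s_ext)
```

With this assumed definition the four rounded results match the reported 4.3, 1.9, 4.2, and 3.7.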
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Table  16 

Indicator 8: Summary Statistics for Prediction Equations by Method

Method          r²        r² change    F           df      S_y.x

Linear          .92137                 152.33**    1,13    .36

Quadratic^a     .97402                 224.95**    2,12    .21
                          .05265       24.32**     1,12

Cubic^a         .97822                 164.65**    3,11    .20
                          .00420       2.12        1,11

Log-linear^b
  log Y         .93096                 175.29**    1,13    log .01872
  antilog Y     .89333                                     .30

Note. Indicator 8 is number of children involved in
divorce or annulment expressed as rate per 1,000 children
under 18 years old in the United States.

^a Both quadratic and cubic forms of the polynomial regression
are presented.

^b Both r² and S_y.x have been recomputed using antilogs of
the log Y; much of the difference between r² (log Y) and r²
(antilog Y) may be due to rounding.

**p<.01
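
The F ratios for r² change in Table 16 can be checked from the r² values alone. The sketch below uses the standard F test for an increment in r² when terms are added to a regression; the function name is hypothetical, and n = 15 is inferred from the reported degrees of freedom (1,13 for the linear model).

```python
def f_change(r2_full, r2_reduced, n, k_full, k_added=1):
    """F ratio for the gain in r2 when k_added terms join a model with k_full terms."""
    df_resid = n - k_full - 1
    return ((r2_full - r2_reduced) / k_added) / ((1 - r2_full) / df_resid)


n = 15  # data points in the original regression, 1953-1967
r2 = {"linear": 0.92137, "quadratic": 0.97402, "cubic": 0.97822}

# Quadratic term added to the linear model, tested on 1 and 12 df.
f_quadratic = f_change(r2["quadratic"], r2["linear"], n, k_full=2)
# Cubic term added to the quadratic model, tested on 1 and 11 df.
f_cubic = f_change(r2["cubic"], r2["quadratic"], n, k_full=3)

print(round(f_quadratic, 2), round(f_cubic, 2))
```

Both values agree with the tabled 24.32 and 2.12, which is why only the quadratic increment is reported as significant.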
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Table  17 
Indicator 8: Observed Y's and Predicted Y's by Method

                          Predicted Y's by Method
Year(X)     Observed Y's  Linear    Quadratic^a   Cubic^a   Log-linear

Original regression^b
1953(1)     6.4           5.7^c     6.2           6.4       5.9
1954(2)     6.4           6.0       6.3           6.3       6.1
1955(3)     6.3           6.3       6.4           6.3       6.3
1956(4)     6.3           6.5       6.5           6.4       6.5
1957(5)     6.4           6.8       6.6           6.5       6.8
1958(6)     6.5           7.1       6.8           6.7       7.0
1959(7)     7.5           7.3       7.0           7.0       7.2
1960(8)     7.2           7.6       7.3           7.3       7.5
1961(9)     7.8           7.8       7.6           7.6       7.8
1962(10)    7.9           8.1       7.9           7.9       8.0
1963(11)    8.2           8.4       8.2           8.3       8.3
1964(12)    8.7           8.6       8.6           8.7       8.6
1965(13)    8.9           8.9       9.0           9.1       8.9
1966(14)    9.4           9.2       9.4           9.4       9.2
1967(15)    9.9           9.4       9.9           9.8       9.5

Extrapolation^d
1970(18)    12.5          10.2      11.5          10.7      10.5
1971(19)    13.6          10.5      12.1          11.0      10.9
1972(20)    14.8          10.7      12.8          11.1      11.3
1973(21)    15.9          11.0      13.5          11.3      11.7
1974(22)    16.4          11.3      14.2          11.3      12.1
1975(23)    17.1          11.5      14.9          11.3      12.5

Note. Indicator 8 is number of children involved in
divorce or annulment expressed as rate per 1,000 children under
18 years old in the United States.

^a Both quadratic and cubic forms of polynomial regression
are presented.

^b The regression line is derived from 2/3 of the known data
points.

^c Predicted Y's are rounded to number of places in original
data.

^d Values are extrapolated beyond the data points used to
generate the regression equation.
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Figure  9.   Indicator  8:   Observed  Y's  and  predicted  Y's  by 
method  (vertical  line  separates  values  of 
original  regression  from  extrapolation) . 
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All  the  methods  are  adequate  predictors  of  Y  for  the 

range of the original regression. The quadratic and cubic
forms of the polynomial regression are only slightly better
than the linear and log-linear methods, however, accounting
for approximately 5% more variance in Y with an S_y.x of .21
and .20, respectively (compared with .36 and .30 for the
latter methods).

When  the  lines  are  extrapolated  beyond  the  original 
values,  the  quadratic  form  clearly  becomes  the  superior 
predictor with an S_ext(y.x) of 1.9. The S_ext(y.x) for the
other  methods  ranges  from  3.7  to  4.3.   Based  on  examination 
of  the  plotted  values  in  Figure  9,  it  would  appear  that  the 
quadratic  form  might  continue  to  have  the  highest  predictive 
accuracy.   Of  course,  if  all  the  data  were  employed  in  the 
regression,  new  curves  influenced  by  the  later  data  points 
would  be  drawn  and  the  predictive  value  of  all  methods 
would  have  to  be  reassessed. 

The  Durbin-Watson  d  for  the  linear  and  log-linear 
methods  approaches  the  lower  limits  of  d.   The  possibility 
of serial correlation of the residuals thus cannot be
conclusively rejected.
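
For readers who wish to reproduce this kind of check, the Durbin-Watson d is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below is an illustration rather than the study's own computation: it rebuilds the linear-method residuals from the reported Indicator 8 equation and the observed 1953-1967 values in Table 17.

```python
# Observed Indicator 8 values for 1953-1967 (X = 1..15), taken from Table 17.
observed = [6.4, 6.4, 6.3, 6.3, 6.4, 6.5, 7.5, 7.2,
            7.8, 7.9, 8.2, 8.7, 8.9, 9.4, 9.9]

# Linear prediction equation reported for Indicator 8.
b0, b1 = 5.48381, 0.26286
residuals = [y - (b0 + b1 * x) for x, y in enumerate(observed, start=1)]

# Residual sum of squares; it should be close to the Table 25 value of 1.65105.
ss_resid = sum(e ** 2 for e in residuals)

# Sum of squared differences between successive residuals.
ss_diff = sum((e1 - e0) ** 2 for e0, e1 in zip(residuals, residuals[1:]))

# Values well below 2 point toward positive serial correlation.
d = ss_diff / ss_resid

print(f"residual SS = {ss_resid:.3f}, Durbin-Watson d = {d:.2f}")
```

The resulting d still has to be compared with tabled lower and upper bounds for the given number of observations and predictors, which are not reproduced here.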


CHAPTER  V 
DISCUSSION 

Two objectives of this study were (a) to select, using
Bronfenbrenner's ecology of education model, and operationally
define at least 10 variables that research has shown to be
related to the outcomes of education; and (b) to use these
variables operationally defined as time series indicators in
the comparison of three purely extrapolative forecasting
methods. In this chapter, methodological strategies involved
in the selection and operational definition of social variables
as well as the use of the Bronfenbrenner model are
discussed in terms of their viability for future use. Results
of the method comparison phase are discussed according
to statistical and practical considerations derived from the
literature.

The  Variables 

Selection 

The  10  variables  selected  for  use  in  this  study  may 
need to be reconsidered, since several of them (e.g., student's
sense of fate control, teacher behavior in the classroom)
are rather obscure both in their constitutive and in
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their  operational  definitions.   Additionally,  several  of  the 
variables  (e.g.,  administrative  leadership  style)  are  only 
peripherally  related  to  educational  outcomes. 

There  appears  to  be  a  need  to  develop  instead  a  number 
of  demographic  variables  related  to  the  family.   Presently 
there  are  very  few  efforts  to  define,  systematically  analyze, 
and  forecast  the  demographic  context  of  education  (Coates, 
Note  1;  Morrison,  1976).   Inclusion  of  social  variables 
representing  changing  patterns  in  the  family  as  to  fertility, 
stability,  employment,  income,  and  so  on  may  result  in  new 
perspectives  on  both  educational  outcomes  and  purposes.   These 
variables  may  be  examined  individually  and  then  in  clusters 
to  discover  trends  and  interdependencies  which  may  continue 
into  the  future.   Of  course,  the  selection  of  variables  need 
not  be  confined  to  demographic  variables;  rather  demographic 
variables  may  provide  a  starting  place  which  has  potential 
for  development  and  expansion. 

Bronfenbrenner's  Ecology  of  Education  Model 

Bronfenbrenner's model of the ecology of education is a
conceptually  useful  representation  which  accounts  for  the 
numerous  factors  which  impinge  upon  the  educational  process. 
The  model  includes  those  areas  which  are  considered  under 
force  analysis  and  contextual  mapping,  two  strategies  used 
in  macro-level  educational  planning.   (See  Hencley  and  Yates, 
1974,  for  a  description  of  these  techniques.)   Furthermore, 
the  model  includes  those  variables  which  traditionally  have 
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been  the  concern  of  educational  researchers. 

The  need  for  a  theoretical  scheme,  or  a  model  derived 
from  theory,  when  selecting  social  indicators  for  study  has 
been  acknowledged  by  Allardt  (1971)  and  Sismondo  (1973).   The 
Bronfenbrenner model may help to fill this need for educational
planners.

Operational  Definition  of  Variables 

The operational definition of social variables in terms
of time series indicators is a challenging and often frustrating
task. Several of the problems acknowledged by Ferriss
(1970) and others involve the nature and availability of the
data. Some data are thought to be of poor quality with the
basis for collection changing frequently over time. Other
data are collected only for certain years or are made available
only for certain years. There are numerous sources of
data; however, it requires careful review and consideration
of available data to find the most reliable measures for use
in analysis and forecasting.

Other problems arise from the process of social measurement
itself. Social investigators need not let these concerns
deter efforts to develop social indicators; rather these
concerns (as described by Etzioni and Lehman, 1969) might be
used to guide future efforts to operationally define social
variables.

Perhaps the most important functions of time series
indicators are (a) the comparison of trends in an indicator
over time and (b) the comparison of different groups in relation
to a specific indicator. Thus, for an indicator such
as births to unwed mothers, the aggregated data may be
examined for a general societal trend; the general trend may
be disaggregated according to state or county, and then
according to race and age of mother. It is often only when
the data are disaggregated (for example, births to unwed
mothers age 10-18) that they become most meaningful for the
particular planning activity. A graphic presentation of the
disaggregated data is a valuable part of any attempt to understand
the underlying social process represented by the time
series indicator.

The  Extrapolative  Methods 

Statistical  Considerations 

The  use  of  the  general  linear  model  was  appropriate  in 
this  study.   While  violations  of  the  assumptions  did  occur 
occasionally  in  the  linear  or  log-linear  methods,  they  could 
be  corrected  by  the  addition  of  the  quadratic  and/or  cubic 
terms  to  the  regression  equation.   In  all  data  sets  one  or 
more  of  the  methods  accounted  for  significant  amounts  of  the 
variance  in  Y. 

In two of the data sets, however, there remained
questionable patterns in the residuals that were not corrected
by any of the methods employed. Coincidentally, the extrapolated
lines of these indicators (numbers 4 and 8) were
extremely poor approximations of the observed data. The
Durbin-Watson d statistic for serial correlation of residuals
was inconclusive for the simple linear and log-linear methods
for indicators 6 and 8 but did not indicate this problem with
the other methods in any of the other data sets.

Thus,  at  least  one  of  the  three  methods  based  on  the 
general  linear  model  was  appropriate  to  use  with  the  data 
sets  in  this  study.   The  application  of  the  general  linear 
model  to  other  data  sets  of  time  series  indicators  or  even 
to  these  data  sets  when  additional  observations  are  included 
would  require  further  investigation,  however. 

In the general forecasting literature it is usually
acknowledged that a method which "best fits" the observed
data should be used for extrapolative purposes. This "best
fit" is indicated by the magnitude of the coefficient of
determination r² and the unbiased standard error of estimate S_y.x.
In this study, however, r² and S_y.x were not always good
indicators of the accuracy of extrapolated values to the observed
data. For example, while the cubic form of the polynomial
best fit the observed Y's for the original regression in
Indicator 5, it was the worst fit when extrapolated into the
future. In several other data sets (namely indicators 1 and
3) the r² and error were similar for the simple linear and
log-linear methods. When extrapolated into the future, however,
the log-linear clearly was the most appropriate method.
No one method was superior to the others overall; each was
superior in the extrapolation phase for at least one data
set.

If the summary statistics of r² and S_y.x are not always
useful in choosing the best method for the extrapolation phase,
then on what basis should the method be chosen? Perhaps this
question can only be answered by a combination of strategies:
(a) application of several different methods to the data; (b)
for each method, extrapolation of trend for specified number
of years into the future; (c) graphic representation of original
data and predicted data for each method; (d) visual
examination of data to determine most likely direction for
observed trend to take based on knowledge of the social
phenomenon being studied; and (e) construction of upper and lower
limits for the extrapolated trend considered to be the most
appropriate.

Therefore,  while  the  mathematical  extrapolation  of  trends 
is  a  useful  approach  to  forecasting,  appropriate  application 
remains  an  art.   Indeed,  forecasting  accuracy  is  dependent 
upon  the  judgment  of  the  investigator  and  his  or  her  knowledge 
of  the  social  phenomenon. 

Lest  the  problems  presented  overshadow  the  positive 
attributes  of  mathematical  extrapolation,  it  should  be  noted 
that  linear  regression  is  a  powerful  tool  which  can  be 
employed  both  in  preliminary  analysis  of  the  data  and  in  the 
projection  of  future  trends.   Even  if  the  assumptions  are 
violated,  the  general  linear  model  may  still  be  employed  for 
analysis of trends; violation of the assumption, however,
precludes  making  any  statement  of  a  probabilistic  nature 
(Hays,  1973,  p.  636). 
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Practical  Considerations 

Evaluation  of  a  forecast  must  be  based  on  more  than 
statistical  criteria.   Other  considerations  such  as  purpose 
of  the  forecast  and  the  user's  need  for  accuracy  are  involved 
(see Harrison, 1976; Martino, 1973a). Thus, final evaluation
of a forecast must be made in relation to the specific
situation  and  the  underlying  motivation  for  the  forecasting 
activity. 

The  results  of  the  mathematical  extrapolation  are 
greatly  influenced  by  the  data  points  used  in  the  original 
regression  especially  if  the  sequence  of  observations  is 
short.   Thus,  one  highly  discrepant  observation  can  change 
the  slope  of  the  line.   A  decision  must  be  made  as  to  whether 
all  data  available  should  be  included.   The  effect  of  editing 
data  is  not  always  advantageous;  in  fact,  it  may  obscure  the 
natural periodicity of the data or the occurrence of unexpected
events which may be critical to an understanding of
the  social  process.   Only  experimentation  and  human  judgment 
will  resolve  this  particular  issue. 


CHAPTER  VI 
SUMMARY,  CONCLUSIONS,  AND  IMPLICATIONS  OF  STUDY 

In  this  chapter  the  main  aspects  of  the  study  are 
summarized.   Conclusions  derived  from  an  analysis  of  the 
results  of  the  study  are  presented.   Future  directions  for 
research suggested by the results of the study are presented.
Implications for the use in educational planning of an
ecological model such as Bronfenbrenner's, time series
indicators, and selected extrapolative methods are discussed.

Summary 

Educational  planners  and  policy  makers  need  adequate 
information  about  the  societal  context  of  education  in  order 
to  make  appropriate  decisions  about  the  future  role  and 
function  of  education.   Given  this  need,  the  problem  in 
this study was (a) to select, using Bronfenbrenner's ecology
of education model, and operationally define at least 10
variables that research has shown to be related to the outcomes
of education; (b) to use these variables operationally
defined  as  time  series  indicators  in  the  comparison  of  three 
purely  extrapolative  forecasting  methods;  and  (c)  to  derive 
implications  for  the  use  of  an  ecological  model  such  as 
Bronfenbrenner's,  time  series  indicators,  and  selected 
extrapolative  methods  for  educational  planning. 
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The  Variables 

Ten  variables  related  to  the  outcomes  of  education  were 
selected;  these  variables  were  then  classified  according  to 
Bronfenbrenner's ecology of education model. The variables
were operationally defined when possible as time series
indicators. Data were collected for these indicators, and
eight  indicators  which  met  the  criteria  established  in  the 
study  were  selected  for  use  in  the  method  comparison  phase. 

The  Methods 

The  literature  in  forecasting  methodology,  economics, 
statistics,  and  time  series  analysis  was  reviewed  in  order 
to  determine  the  most  appropriate  extrapolative  methods  to 
use with the selected time series indicators. Three statistical
methods derived from the general linear model were
selected for comparison. These methods included simple
linear regression, second and third degree polynomial
regression, and log-linear regression (in which the dependent
variable,  usually  a  rate  or  percentage,  undergoes  logarithmic 
transformation).   Time  in  years  was  used  as  the  independent 
variable. 

Each  method  was  applied  to  each  time  series  indicator. 
Each data set was divided into thirds; two-thirds of the
data points were used to establish the prediction equation.
This equation was used to predict the remaining third of the
data points. The methods were then compared according to
(a) the fit of the predicted values to the observed values
for the original regression and (b) the fit of the extrapolated
values to observed values. Both summary statistics
and visual inspection of the data were utilized.
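
The comparison procedure summarized above can be sketched in a few lines. The code below is an illustrative reconstruction, not the study's program: it fits each method to the first two-thirds of the Indicator 7 series (observed values from Table 15) by ordinary least squares and scores the held-out third with a root mean squared error. The `polyfit`, `predict`, and `rmse` helpers are hypothetical names, and base-10 logarithms are assumed for the log-linear method.

```python
import math


def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients [b0, b1, ...] via the normal equations."""
    n = degree + 1
    # Augmented normal-equation matrix (X'X | X'y).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)]
         + [sum(y * x ** i for x, y in zip(xs, ys))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def predict(coefs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(b * x ** k for k, b in enumerate(coefs))


def rmse(obs, pred):
    """Root mean squared difference between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


# Indicator 7 observed values, 1964-1975 (X = 1..12), from Table 15.
y = [25.5, 27.1, 29.4, 31.6, 33.0, 34.6, 37.5, 39.1, 41.6, 40.9, 45.2, 48.7]
x = list(range(1, len(y) + 1))
split = len(y) * 2 // 3  # first two-thirds are fitted, last third is held out
x_fit, y_fit, x_ext, y_ext = x[:split], y[:split], x[split:], y[split:]

results = {}
for name, degree in [("linear", 1), ("quadratic", 2), ("cubic", 3)]:
    coefs = polyfit(x_fit, y_fit, degree)
    results[name] = rmse(y_ext, [predict(coefs, xi) for xi in x_ext])

# Log-linear: regress log10(Y) on X, then take antilogs of the predictions.
log_coefs = polyfit(x_fit, [math.log10(v) for v in y_fit], 1)
results["log-linear"] = rmse(y_ext, [10 ** predict(log_coefs, xi) for xi in x_ext])

for name, err in results.items():
    print(f"{name:10s} extrapolation error = {err:.2f}")
```

Because the sketch works from unrounded predictions, its errors may differ in the last digit from values computed from the rounded table entries, but the ordering of the methods should be the same as that described for Indicator 7.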

Results 

Results  of  the  method  comparison  were  (a)  no  one  method 
was  a  superior  predictor  for  all  indicators;  (b)  each  method 
was  a  superior  predictor  of  at  least  one  indicator;  and  (c) 
the  summary  statistics  for  the  original  regression  were  not 
consistently  related  to  the  accuracy  of  the  extrapolated 
values.   Despite  the  limitations  noted  in  this  study,  the 
Bronfenbrenner  model,  social  variables  operationalized  as 
time  series  indicators,  and  selected  extrapolative  methods 
have  significant  potential  for  development  and  application 
in  educational  planning. 

Conclusions 

The  following  conclusions  appear  to  be  warranted  by  the 
results  of  this  study: 

1.  The  Bronfenbrenner  model  is  a  useful  framework  for 
considering  the  numerous  factors  impinging  upon  the  learner. 

2.  Time  series  indicators  provide  a  means  to  compare 
trends  in  an  indicator  over  time  or  to  compare  different 
groups  in  relation  to  a  specific  indicator. 

3.  The  general  linear  model  is  appropriate  for  the 
analysis and extrapolation of the selected time series
indicators used in the study.

4. Each of the three extrapolative methods is
appropriate for use with some indicators but not for others.
Measures of "best fit" such as r² and the standard error of
estimate  are  not  reliable  criteria  for  the  selection  of  an 
extrapolative  method.   A  combination  of  strategies  such  as 
graphic  representation  of  original  and  predicted  data,  analysis 
of  residuals,  and  knowledge  of  the  social  phenomena  being 
studied  may  provide  guidance  as  to  the  most  appropriate  method 
for  a  particular  indicator. 

Suggestions  for  Future  Research 

The  data  from  the  present  study  clearly  indicate  the 
potential application of (a) the Bronfenbrenner model, (b)
selected  social  variables  operationalized  as  time  series 
indicators,  and  (c)  selected  extrapolative  techniques  in 
educational  planning.   However,  a  number  of  issues  raised 
in  Chapter  V  remain  to  be  clarified  and  resolved. 

Some  of  the  issues  such  as  selection  and  operational 
definition  of  variables  are  conceptual  in  nature,  while 
others  are  methodological.   It  would  be  useful  to  replicate 
the  procedures  used  in  this  study  with  other  time  series 
indicators  or  for  other  time  periods  of  the  indicators  used 
in  this  study.   For  example,  does  it  increase  the  accuracy 
of  the  extrapolation  to  edit  the  data?   And  if  so,  by  how 
much?  At  what  price?   Many  other  questions  remain;  answers 
will  require  coordinated  research  efforts  in  a  number  of 
disciplines in order to approach the problems from a diversity
of perspectives and with a variety of skills.

In Chapter V (p. 100) it was suggested that a
combination of strategies might be appropriately utilized to
extrapolate trends in time series data. Demonstration of the
application of such strategies to actual data would provide
valuable assistance to educational planners.

The  possibility  of  multimethod  forecasting  should  be 
explored;  that  is,  extrapolative  methods  may  be  combined  with 
an  intuitive  method  such  as  trend  impact  analysis  to  look  at 
the  effect  of  unexpected  events  on  projected  trends.   The 
probability  of  various  trends  occurring  together  may  be 
evaluated  in  a  cross  impact  matrix,  or  the  logical  consistency 
of  various  trends  might  be  examined  in  scenarios. 

Harrison  (1976)  noted  that 

the  problems  of  social  forecasting  are 
the  problems  of  all  social  science  inquiry 
related  to  social  processes,  and  the 
modes  for  resolving  these  problems  are 
in  many  ways  the  same.   (p.  80) 

Thus, theory-based social forecasting utilizing time series
indicators has considerable potential application in educational
planning; on-going empirical studies are needed, however,
to actualize this potential.

Implications  for  Planners  and  Policy  Makers 

The Bronfenbrenner ecology of education model, social
indicators, and extrapolative methods can be useful tools
for educational planners and policy makers at the national,
state, and local levels. Johnson (1975) has noted:

From the viewpoint of futures research,
social indicators may be regarded as a
means for developing more adequate answers
to the age-old questions:
"What are we, what are we in the
process of becoming, and do we like
what we see?" (p. 10)

Certainly educational planners need to monitor societal
processes and interpret change in relation to past, present,
and future educational purposes and outcomes. Policy makers
need to be informed of the

probable or possible consequences of a
continuation of observed trends and of
possible responsiveness of these trends
to different sets of postulated conditions.
(Johnson, 1975, p. 10)

Thus, social forecasting strategies such as those
described in this study should be a part of on-going educational
planning efforts. Data on social indicators might
be incorporated into existing management information systems.
While initiation of such a data bank would require a substantial
investment of resources, maintenance would be
minimal. Forecasts could then be issued and revised at
regular intervals. Certainly the use of social forecasting
strategies can assist educational planners and policy
makers in developing an understanding of the past, perspective
in analyzing the present, and hopefully vision in
planning for the future.

Table 18
Indicator 1: ANOVA Summary Tables by Method

Source of Variation    SS                 df    MS                 F

Linear
Regression             15176369.41422      1    15176369.41422     571.04**
Residual                 398650.11520     15       26576.67435

Quadratic
Regression             15190416.01187      2     7595208.00593     276.47**
Residual                 384603.51754     14       27471.67982

Cubic
Regression             15284842.00799      3     5094947.33600     228.25**
Residual                 290177.52142     13       22321.34780

Log-linear
Regression                     .06541      1            .06541     488.54**
Residual                       .00201     15            .00013

Note. Indicator 1 is median family income expressed in
1971 constant dollars.
**p<.01
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Table  19 
Indicator 2: ANOVA Summary Tables by Method

Source of Variation    SS         df    MS         F

Linear
Regression              .68388     1     .68388     1.18
Residual               3.48487     6     .58081

Quadratic
Regression             3.39409     2    1.69704    10.95*
Residual                .77466     5     .15493

Cubic
Regression             3.72084     3    1.24028    11.08*
Residual                .44791     4     .11198

Log-linear
Regression              .00123     1     .00123     1.22
Residual                .00605     6     .00101

Note. Indicator 2 is number of families in the United
States headed by women expressed as a percentage of total
families.

*p<.05
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Table  20 
Indicator 3: ANOVA Summary Tables by Method

Source of Variation    SS           df    MS           F

Linear
Regression             195.31547     1    195.31547    826.05**
Residual                 3.07378    13       .23644

Quadratic
Regression             195.33311     2     97.66655    383.49**
Residual                 3.05615    12       .25468

Cubic
Regression             196.31194     3     65.43731    346.51**
Residual                 2.07732    11       .18885

Log-linear
Regression                .03848     1       .03848    665.69**
Residual                  .00075    13       .00006

Note. Indicator 3 is number of wives in the labor force
expressed as a percentage of total wives in the United States.

**p<.01
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Table  21 
Indicator 4: ANOVA Summary Tables by Method

Source of Variation    SS          df    MS          F

Linear
Regression             30.99500     1    30.99500     7.27*
Residual               42.65403    10     4.26540

Quadratic
Regression             31.36855     2    15.68428     3.34
Residual               42.28048     9     4.69783

Cubic
Regression             66.53315     3    22.17772    24.93**
Residual                7.11589     8      .88949

Log-linear
Regression               .04364     1      .04364     7.33*
Residual                 .05953    10      .00595

Note. Indicator 4 is number of marriages in Florida
expressed as rate per 1,000 population in Florida.
*p<.05
**p<.01
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Table  22 
Indicator 5: ANOVA Summary Tables by Method

Source of Variation    SS          df    MS         F

Linear
Regression               .50827     1     .50827      .46
Residual               11.14840    10    1.11484

Quadratic
Regression              2.61121     2    1.30561     1.30
Residual                9.04545     9    1.00505

Cubic
Regression             10.73976     3    3.57992    31.23**
Residual                 .91691     8     .11461

Log-linear
Regression               .01472     1     .01472     1.36
Residual                 .10791    10     .01079

Note. Indicator 5 is number of dissolutions of marriage
in Florida expressed as rate per 1,000 population.
**p<.01
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Table  23 
Indicator 6: ANOVA Summary Tables by Method

Source of Variation    SS          df    MS         F

Linear
Regression              5.68341     1    5.68341     5.24
Residual                6.50534     6    1.08422

Quadratic
Regression             11.29346     2    5.64673    31.54**
Residual                 .89529     5     .17906

Cubic
Regression             11.84805     3    3.94935    46.37**
Residual                 .34070     4     .08518

Log-linear
Regression               .00340     1     .00340     5.10
Residual                 .00400     6     .00067

Note. Indicator 6 is number of resident live births in
Florida expressed as rate per 1,000 population.
**p<.01
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Table  24 
Indicator 7: ANOVA Summary Tables by Method

Source of Variation    SS           df    MS           F

Linear
Regression             160.48592     1    160.48592    1358.03**
Residual                  .70905     6       .11818

Quadratic
Regression             160.50520     2     80.25260     581.74**
Residual                  .68977     5       .13795

Cubic
Regression             160.53081     3     53.51027     322.27**
Residual                  .66416     4       .16604

Log-linear
Regression                .02975     1       .02975     893.91**
Residual                  .00020     6       .00003

Note. Indicator 7 is number of 3 to 5 year olds enrolled
in nursery school and kindergarten expressed as percentage of
total children 3 to 5 years old in the United States.

**p<.01
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Table  25 
Indicator 8: ANOVA Summary Tables by Method

Source of Variation    SS          df    MS          F

Linear
Regression             19.34629     1    19.34629    152.33**
Residual                1.65105    13      .12700

Quadratic
Regression             20.45183     2    10.22592    224.95**
Residual                 .54550    12      .04546

Cubic
Regression             20.53993     3     6.84664    164.65**
Residual                 .45740    11      .04158

Log-linear
Regression               .06146     1      .06146    175.29**
Residual                 .00456    13      .00035

Note. Indicator 8 is number of children involved in divorce
or annulment expressed as rate per 1,000 children under 18 years
old in the United States.
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